(57) Abstract: The invention provides methods 
for isolating cell-type specific mRNAs by 
selectively isolating ribosomes or proteins that 
bind mRNA in a cell type specific manner, and, 
thereby, the mRNA bound to the ribosomes or 
proteins that bind mRNA. Ribosomes, which 
are riboprotein complexes, bind mRNA that is 
being actively translated in cells. According to 
the methods of the invention, cells are engineered 
to express a molecularly tagged ribosomal protein 
or protein that binds mRNA by introducing into 
the cell a nucleic acid comprising a nucleotide 
sequence encoding a ribosomal protein or protein 
that binds mRNA fused to a nucleotide sequence 
encoding a peptide tag. The tagged ribosome or 
mRNA binding protein can then be isolated, along 
with the mRNA bound to the tagged ribosome or 
mRNA binding protein, and the mRNA isolated 
and further used for gene expression analysis. The 
methods of the invention facilitate the analysis and 
quantification of gene expression in the selected 
cell type present within a heterogeneous cell 
mixture, without the need to isolate the cells of 
that cell type as a preliminary step. 
METHOD FOR ISOLATING CELL-TYPE SPECIFIC mRNAs 

This application claims the benefit of application no. 60/340,689 filed October 29,
2001, the entire disclosure of which is incorporated herein by reference in its entirety.

1. TECHNICAL FIELD 

The present invention relates to methods for isolating cell-type specific mRNAs by 
isolating ribosomes in a cell-type specific manner. According to the methods of the
invention, ribosomes or proteins that bind mRNA of the selected cell type are molecularly 
tagged and isolated, and the mRNA bound to the ribosomes or proteins that bind mRNA is 
then isolated and analyzed. The methods of the invention facilitate the analysis and 
quantification of gene expression in the selected cell type present within a heterogeneous 
cell mixture, without the need to isolate the cells of that cell type as a preliminary step.

2. BACKGROUND OF THE INVENTION 

An important paradigm in the development of new diagnostics and therapies for 
human diseases and disorders is the characterization of the gene expression of defined cell
types. The cellular complexity of many tissues (such as the nervous system), however,
poses a challenge for those seeking to characterize gene expression at this level. The 
enormous heterogeneity of a tissue such as the nervous system (thousands of neuronal cell 
types, with non-neuronal cells outnumbering neuronal cells by an order of magnitude) is a 
barrier to the identification and analysis of gene transcripts present in individual cell types. 

One way to overcome this barrier is to tag gene transcripts directly or indirectly, i.e.,
mRNA, present in a particular cell type, in such a manner as to allow facile isolation of the 
gene transcripts without the need to isolate the individual cells of that cell type as a 
preliminary step. We describe such a technology here. 

3. SUMMARY OF THE INVENTION

The invention provides methods for isolating cell-type specific mRNAs by 
selectively isolating ribosomes or proteins that bind mRNA in a cell type specific manner, 
and, thereby, the mRNA bound to the ribosomes or proteins that bind mRNA. Ribosomes, 
which are riboprotein complexes, bind mRNA that is being actively translated in cells. 
According to the methods of the invention, cells are engineered to express a molecularly
tagged ribosomal protein or protein that binds mRNA by introducing into the cell a nucleic
acid comprising a nucleotide sequence encoding a ribosomal protein or protein that binds
mRNA fused to a nucleotide sequence encoding a peptide tag. The peptide tag can be any
non-ribosomal protein peptide or non-mRNA binding protein peptide that is specifically
bound by a reagent that does not recognize a component, other than the peptide tag, of the
cell fraction from which the tagged ribosomes or proteins that bind mRNA are to be isolated,
for example, from a whole cell lysate or post-mitochondrial fraction (or any other ribosome
or polysome preparation or other preparation containing the tagged protein that binds mRNA
being analyzed). In a preferred embodiment, the polysome preparation is a membrane-associated
polysome preparation. Specifically, the peptide tag may be an epitope that is recognized by
an antibody that does not specifically bind any epitope expressed in a cell or
ribosome/polysome fraction from an unengineered cell. As defined herein, specific binding
is not competed away by addition of non-specific proteins, e.g., bovine serum albumin
(BSA). The tagged ribosomal protein or mRNA binding protein is then expressed
selectively in a cell population of interest (for example, by operably linking the nucleotide
sequence encoding the tagged ribosomal protein or mRNA binding protein to a cell-type 
specific promoter and/or other transcriptional element). In a preferred embodiment, the 
tagged ribosomal protein or mRNA binding protein is overexpressed. 

Monosomes or polysomes (which are, respectively, single or multiple ribosomes in a
complex with a single mRNA) or other mRNA-containing complexes are isolated selectively
from the cell population of interest through the use of the tagged ribosomal protein subunit 
or other mRNA binding protein. As used herein, isolated means that the ribosomes are 
separated from other cell components, specifically that the ribosomes are substantially free 
of untagged ribosomes and of RNA (particularly mRNA) not bound by ribosomes or mRNA
binding protein. In particular, the composition is 50%, 60%, 70%, 80%, 90%, 95% or 99%
tagged ribosome or mRNA binding protein and associated mRNA. The mRNA species that 
are bound to the cell-type specific ribosomes or mRNA binding protein are then isolated, 
and can subsequently be profiled and quantified, to analyze gene expression in the cell. In a 
specific embodiment, because nascent polypeptides are attached to isolated monosomes and
polysomes, the methods of the invention can also be used to isolate newly synthesized
polypeptides from a cell type of interest (e.g., for proteomic applications), for example, 
using antibodies that specifically recognize an epitope on a specific polypeptide being 
synthesized by the cell. 

In preferred embodiments, the invention provides transformed organisms (including
animals, plants, fungi and bacteria), e.g., a transgenic animal such as a transgenic mouse,



WO 03/038049 



PCT/US02/34645 



that expresses one or more tagged ribosomal protein(s) or mRNA binding protein(s) within 
a chosen cell type. The invention also provides cultured cells that express one or more 
tagged ribosomal proteins or mRNA binding proteins. Cell-type specific expression is 
achieved by driving the expression of the tagged ribosomal protein using the endogenous
promoter of a particular gene, wherein the expression of the gene is a defining characteristic
of the chosen cell type (i.e., the promoter causes gene expression specifically in the chosen 
cell type). Thus, "cell-type" refers to a population of cells characterized by the expression 
of a particular gene. In a preferred embodiment, a collection of transgenic mice expressing 
tagged ribosomal proteins within a set of chosen cell types is assembled. Additionally, since
the level of expression of the tagged ribosomal protein or mRNA binding protein within a
cell may be important in the efficiency of the isolation procedure, in certain embodiments of 
the invention, a binary system can be used, in which the endogenous promoter drives 
expression of a protein that then activates a second expression construct. This second 
expression construct uses a strong promoter to drive expression of the tagged ribosomal
protein or mRNA binding protein at higher levels than is possible using the endogenous
promoter itself. 

In specific embodiments, the invention provides molecularly tagged ribosomes, 
preferably bound to mRNA, that are bound to an affinity reagent for the molecular tag. In 
more specific embodiments, the molecularly tagged ribosomes are bound to an affinity
reagent which is bound to a solid support. In other particular embodiments, the invention
provides molecularly tagged ribosomal proteins and mRNA binding proteins of the 
invention (and the ribosomes, ribosomal-mRNA complexes, and mRNA binding protein- 
mRNA complexes containing them); nucleic acids comprising nucleotide sequences 
encoding a molecularly tagged ribosomal protein or mRNA binding protein of the
invention; vectors and host cells comprising these nucleic acids and tagged proteins and
ribosomes of the invention.

The methods of the invention are advantageous because they permit the isolation of 
gene transcripts, or mRNA, present in a particular cell type, as defined by the common 
expression of a given gene, in such a manner as to allow their facile isolation without the
need to isolate the individual cells of that cell type as a preliminary step.

Additionally, in specific embodiments, the methods of the invention may be used to 
isolate other organelles or subcellular structures by molecularly tagging proteins integral to 
those organelles or structures. In a particular embodiment, the methods of the invention are 
used to isolate cell specific mRNAs for secreted, membrane bound and lysosomal proteins by
isolating tagged membrane bound ribosomes.
4. DESCRIPTION OF THE FIGURES 

FIG. 1. Polysomes from cells transfected with plasmids expressing tagged versions
of ribosomal proteins S6 (lane 2, in duplicate), L32 (lane 4, in duplicate), and L37 (lane 5,
in duplicate) contain proteins that are reactive to the anti-Strep-tag II antibodies. These
proteins correspond to the predicted molecular weights of the S6 (34 kDa), L32 (52 kDa),
and L37 (9 kDa) ribosomal proteins. The S6 and L37 proteins appear to be more abundantly
represented in the polysomal fraction compared to the L32 protein. Tagged S20 (lane 3, in 
duplicate) does not appear to be present in the polysomal fraction. Polysomes from 
untransfected cells (lane 1, in duplicate) do not display any immunoreactive material. 

FIG. 2. Ribosomal RNA is present (arrow) in material immunoprecipitated from
tagged S6 (lane 2) transfectants. Such RNA is also present at low levels in material from 
tagged L37 transfectants (lane 3). Such RNA is not present in material from untransfected 
cells (lane 1). 

5. DETAILED DESCRIPTION OF THE INVENTION

The invention provides methods for isolating cell-type specific mRNAs by 
selectively isolating ribosomes, or other proteins that bind mRNA, in a cell type specific 
manner, and, thereby, the mRNA bound to the ribosomes or mRNA binding proteins. 
Ribosomes, which are riboprotein complexes, bind mRNA that is being actively translated
in cells. According to the methods of the invention, preferably, cells are engineered to
express a molecularly tagged ribosomal protein or mRNA binding protein by introducing 
into the cell a nucleic acid comprising a nucleotide sequence encoding a ribosomal protein 
or mRNA binding protein fused to a nucleotide sequence encoding a peptide tag. The 
peptide tag can be any peptide that is not from a ribosomal protein or mRNA binding
protein and that is specifically bound by a reagent that does not recognize a component,
other than the peptide tag, of the cell fraction from which the tagged ribosomes or mRNA 
binding proteins are to be isolated, for example, from a whole cell lysate or post- 
mitochondrial fraction (or any other ribosome or polysome preparation or preparation 
containing mRNA binding protein bound to mRNA being analyzed). For example, the
peptide tag may be an epitope that is recognized by an antibody that does not specifically
bind any epitope expressed in a cell or ribosome/polysome fraction (or other fraction) from
an unengineered cell. As defined herein, specific binding is not competed away by addition
of non-specific proteins, e.g., bovine serum albumin (BSA).

The tagged ribosomal protein or mRNA binding protein is then expressed selectively
in a cell population of interest (for example, by operably linking the nucleotide sequence
encoding the tagged ribosomal or mRNA binding protein to a cell-type specific promoter,
enhancer and/or other transcriptional element). The fused nucleotide sequences may be
under the control of a transcriptional element (e.g., promoter or enhancer) that activates
transcription specifically in the cell type of choice (for example, transcriptional regulatory
elements that control expression of the gene, the expression of which characterizes the cell
type of choice, termed herein the "characterizing gene"). In a preferred embodiment, the 
tagged ribosomal or mRNA binding protein is overexpressed. Cell-specific polysomes (or 
other fraction containing the tagged mRNA binding protein) containing the tag are purified, 
exploiting affinity of a purification reagent (e.g., an antibody or other biological compound
that binds the tag) for the tag. The purification reagent can then be isolated itself or be
bound to another structure, e.g., a bead, that can be isolated from other components in the 
cell, and bound mRNA is isolated from purified polysomes for subsequent gene expression 
analysis. 

5.1. MOLECULAR TAGGING OF RIBOSOMES AND mRNA BINDING PROTEINS

The invention provides methods for isolating cell-type specific mRNA using 
molecularly tagged ribosomal proteins that become incorporated into the ribosomes of a 
particular cell type or molecularly tagged mRNA binding proteins that are expressed in a
particular cell type of interest. Specifically, ribosomes and mRNA binding proteins can be
molecularly tagged by expressing in the cell type of interest a ribosomal fusion protein or 
mRNA binding protein fusion protein containing all or a portion of a ribosomal or mRNA 
binding protein (preferably, the portion has the biological activity of the native ribosomal 
protein or mRNA binding protein, i.e., can function in an intact ribosome to carry out
translation or binds mRNA) fused to (for example, through a peptide bond) a protein or
peptide tag that is not a ribosomal protein or mRNA binding protein or portion thereof and,
preferably, is not found in the organism in which the tagged protein is being expressed. Such
expression can be carried out by introducing into cells, or into an entire organism, a nucleic
acid encoding the molecularly tagged ribosomal protein or mRNA binding protein, under
the control of transcriptional regulatory elements that direct expression in the cell type of
choice, or by putting the expression of the ribosomal or mRNA binding protein fusion
protein under the control of an endogenous promoter by homologous recombination or in a
bacterial artificial chromosome ("BAC").

The invention further provides methods for isolating cell-type specific mRNA by
tagging proteins that bind to mRNA, preferably actively translated mRNA. In a preferred
embodiment, the protein that binds mRNA is not poly A binding protein. In another
embodiment, the protein that binds mRNA is a CAP binding protein or a processing factor
that binds the 3' untranslated region of the mRNA. In certain other embodiments, the
ribosome or mRNA binding protein is molecularly tagged by engineering the ribosome or
mRNA binding protein to bind a small molecule, e.g., a peptide, that is not significantly
bound by the unengineered ribosome or mRNA binding protein.

The nucleic acid encoding the ribosomal protein or other mRNA binding protein
fused to the peptide tag can be generated by routine genetic engineering methods in which a
nucleotide sequence encoding the amino acid sequence of the peptide tag is
engineered in frame with the nucleotide sequence encoding a ribosomal protein or mRNA
binding protein. This can be accomplished by any method known in the art, for example, via
oligonucleotide-mediated site-directed mutagenesis or polymerase chain reaction and other
routine protocols of molecular biology (see, e.g., Sambrook et al., 2001, Molecular Cloning,
A Laboratory Manual, Third Edition, Cold Spring Harbor Laboratory Press, N.Y.; and
Ausubel et al., 1989, Current Protocols in Molecular Biology, Greene Publishing Associates
and Wiley Interscience, N.Y., both of which are hereby incorporated by reference in their
entireties). In certain embodiments, the method of Walles-Granberg et al. (Biochim.
Biophys. Acta, 2001, 1544(1-2): 378-385, which is incorporated herein by reference in its
entirety) is used.
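
By way of illustration only, the in-frame engineering described above can be sketched in Python. This is a minimal sketch under stated assumptions, not a protocol of the invention: the open reading frame `toy_cds` is a hypothetical placeholder, the FLAG-encoding sequence uses arbitrarily chosen codons, and the function names are introduced here purely for the example. The sketch places the tag at the C-terminus, removes the stop codon of the protein-coding sequence so that translation continues into the tag, and confirms the reading frame by translation with the standard genetic code.

```python
# Minimal sketch: append a peptide-tag coding sequence in frame at the
# C-terminus of a coding sequence (CDS). Sequences are hypothetical examples.

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    for pos in range(0, len(cds) - len(cds) % 3, 3):
        amino_acid = CODON_TABLE[cds[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def fuse_tag_c_terminal(cds: str, tag_cds: str) -> str:
    """Return a CDS encoding the protein followed, in frame, by the tag."""
    cds, tag_cds = cds.upper(), tag_cds.upper()
    if len(cds) % 3 or len(tag_cds) % 3:
        raise ValueError("both sequences must be a whole number of codons")
    # Drop the protein's own stop codon so translation continues into the tag.
    if cds[-3:] in STOP_CODONS:
        cds = cds[:-3]
    # Make sure the fusion terminates with a stop codon.
    if tag_cds[-3:] not in STOP_CODONS:
        tag_cds += "TAA"
    return cds + tag_cds


# Hypothetical ORF (Met-Lys-Leu-Arg-Lys-stop) and a FLAG-encoding sequence
# (Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys) written with arbitrarily chosen codons.
toy_cds = "ATGAAGCTGCGCAAGTAA"
flag_cds = "GACTACAAGGACGACGACGACAAG"
fusion = fuse_tag_c_terminal(toy_cds, flag_cds)
print(translate(fusion))
```

An N-terminal fusion would follow the same logic with the order reversed, keeping the initiator ATG at the start of the tag and omitting any stop codon between the tag and the protein.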

The nucleotide sequence encoding the peptide tag is preferably inserted in frame
such that the tag is placed at the N- or C-terminus of the ribosomal protein, since these 
portions of proteins are often accessible to detection or purification reagents. The peptide 
tag, however, may be inserted into any portion of the ribosomal protein such that when the 
protein is incorporated into an intact ribosome, the insertion of the tag does not prevent
ribosomal function and the tag is accessible in the intact ribosome to the purification reagent
to be used in the isolation. If an mRNA binding protein is used, the tag may be inserted into
any portion of the protein such that the protein binds mRNA and the tag is accessible to the 
purification reagent. 

Encoded peptide tags can be any non-ribosomal protein (or non-mRNA binding)
peptide or protein (or portion thereof) that is not present and/or accessible in the cell of
interest (or the cell fraction from which the tagged ribosomes or mRNA binding protein are 
to be affinity isolated) for which there exists an affinity reagent that recognizes the peptide 
and that is accessible to solution (and thereby, the peptide tag) in the intact ribosomes or 
mRNA binding protein bound to mRNA. 

Molecular tagging with epitopes ("epitope tagging") is well known in the art
(reviewed in Fritze CE, Anderson TR. Epitope tagging: general method for tracking
recombinant proteins. Methods Enzymol. 2000;327:3-16; Jarvik JW, Telmer CA. Epitope
tagging. Annu Rev Genet. 1998;32:601-18). An epitope tag can be any peptide or protein that
is not normally present and/or accessible in the cell of interest (or other cells that will be
contacted with the reagent that binds the tag) for which there exists an antibody that 
recognizes the protein, and that is accessible to solution in the intact ribosomes or mRNA 
binding protein-mRNA complexes. 

Peptide tags can include those for which methods/reagents exist that allow facile
identification of the tagged ribosomal protein or mRNA binding protein, but that are unlikely
to inhibit or interfere with function of the tagged ribosomal protein or mRNA binding protein.
The tag may be of any length that permits binding to the corresponding binding reagent, but
does not interfere with the tagged protein's binding to the mRNA. In a preferred
embodiment, the tag is about 8, 10, 12, 15, 18 or 20 amino acids, is less than 15, 20, 25, 30,
40 or 50 amino acids, but may be 100, 150, 200, 300, 400 or 500 or more amino acids in
length. The tag may be bound specifically by a reagent that does not bind any component
of: (1) the cell of interest; or (2) a polysomal preparation of interest; or (3) whatever cellular
fraction of interest is being contacted by the reagent that binds the tag. Molecular tags may
include, by way of example, and not by limitation, protein A fragments; myc epitopes (Evan
et al., 1985, Mol. Cell Biol. 5(12):3610-3616); BTag (Wang et al., 1996, Gene 169(1):53-58);
and polyhistidine tracts (Bornhorst et al., 2000, Purification of proteins using polyhistidine
affinity tags, Methods Enzymol. 326:245-54). Other preferred tags include, but are not
limited to:

    (1) a portion of the influenza virus hemagglutinin protein (Tyr-Pro-Tyr-Asp-Val-
Pro-Asp-Tyr-Ala; SEQ ID NO: 1). The reagent used for purification is a monoclonal
antibody recognizing the tagged protein (12CA5) (Wilson IA, Niman HL, Houghten RA,
Cherenson AR, Connolly ML, Lerner RA. The structure of an antigenic determinant in a
protein. Cell. 1984 Jul;37(3):767-78).

(2) a portion of the human c-myc gene (Glu-Gln-Lys-Leu-Ile-Ser-Glu-Glu-Asp-Leu; 
SEQ ID NO: 2). The reagent used for purification is a monoclonal antibody recognizing the
tagged protein (9E10) (Evan GI, Lewis GK, Ramsay G, Bishop JM. Isolation of monoclonal 
antibodies specific for human c-myc proto-oncogene product. Mol Cell Biol. 1985 
Dec;5(12):3610-6). 

    (3) a portion of the bluetongue virus VP7 protein (Gln-Tyr-Pro-Ala-Leu-Thr; SEQ
ID NO: 3). The reagent used for purification is a monoclonal antibody recognizing the
tagged protein (D11 and/or F10) (Wang LF, Yu M, White JR, Eaton BT. BTag: a novel
six-residue epitope tag for surveillance and purification of recombinant proteins. Gene.
1996 Feb 22;169(1):53-8).

    (4) a FLAG peptide (e.g., Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys; SEQ ID NO: 4).
The reagents used for purification are monoclonal antibodies recognizing the tagged protein
(e.g., M1 and/or M2) (Sigma) (Hopp et al., U.S. Patent No. 4,703,004, entitled "Synthesis
of protein with an identification peptide" issued October 27, 1987; Brizzard BL, Chubet
RG, Vizard DL. Immunoaffinity purification of FLAG epitope-tagged bacterial alkaline
phosphatase using a novel monoclonal antibody and peptide elution. Biotechniques. 1994
Apr;16(4):730-5; Knappik A, Pluckthun A. An improved affinity tag based on the FLAG
peptide for the detection and purification of recombinant antibody fragments.
Biotechniques. 1994 Oct;17(4):754-761).

    (5) a Strep-tag peptide (e.g., Ala-Trp-Arg-His-Pro-Gln-Phe-Gly-Gly; SEQ ID NO:
5). In a preferred embodiment, a Strep-tag peptide is used. The reagent used for
purification is one of several optimized versions of streptavidin that recognizes the tagged
protein (IBA GmbH) (Skerra et al., U.S. Patent No. 5,506,121, entitled "Fusion peptides with
binding activity for streptavidin," issued April 9, 1996; Skerra A, Schmidt TG. Applications
of a peptide ligand for streptavidin: the Strep-tag. Biomol Eng. 1999 Dec 31;16(1-4):79-86;
Skerra A, Schmidt TG. Use of the Strep-Tag and streptavidin for detection and purification
of recombinant proteins. Methods Enzymol. 2000;326:271-304).
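
For convenience, the five tags enumerated above and their purification reagents can be collected in a small lookup structure. The following Python sketch is illustrative only; the class name `PeptideTag`, its fields, and the 15-residue check (taken from the smallest of the preferred upper limits on tag length recited above) are assumptions made for the example. The sequences and reagent names are those given in items (1) through (5).

```python
from dataclasses import dataclass

# Three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


@dataclass(frozen=True)
class PeptideTag:
    """A peptide tag and the affinity reagent named for it in the text."""
    name: str
    residues: str   # hyphen-separated three-letter codes, as written above
    reagent: str

    @property
    def one_letter(self) -> str:
        return "".join(THREE_TO_ONE[r] for r in self.residues.split("-"))

    @property
    def length(self) -> int:
        return len(self.residues.split("-"))


TAGS = [
    PeptideTag("HA", "Tyr-Pro-Tyr-Asp-Val-Pro-Asp-Tyr-Ala",
               "monoclonal antibody 12CA5"),
    PeptideTag("c-myc", "Glu-Gln-Lys-Leu-Ile-Ser-Glu-Glu-Asp-Leu",
               "monoclonal antibody 9E10"),
    PeptideTag("BTag", "Gln-Tyr-Pro-Ala-Leu-Thr",
               "monoclonal antibody D11 and/or F10"),
    PeptideTag("FLAG", "Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys",
               "monoclonal antibody M1 and/or M2"),
    PeptideTag("Strep-tag", "Ala-Trp-Arg-His-Pro-Gln-Phe-Gly-Gly",
               "optimized streptavidin"),
]

for tag in TAGS:
    # Flag whether each tag falls under a 15-residue upper limit.
    short = "yes" if tag.length < 15 else "no"
    print(f"{tag.name:10s} {tag.one_letter:12s} {tag.length:2d} aa  "
          f"<15 aa: {short}  reagent: {tag.reagent}")
```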

Any ribosomal protein or mRNA binding protein can be molecularly tagged for use 
in the methods of the invention, as described in this section, provided that when the 
ribosomal protein is molecularly tagged and incorporated into a ribosome, the ribosome can 
bind mRNA and, preferably, translate the mRNA into protein, or, when the mRNA binding
protein is molecularly tagged, it can bind mRNA. In addition, the tag of the tagged
ribosomal protein or mRNA binding protein must be accessible to the purification reagent, 
so that the reagent can be used to purify the intact ribosomes or mRNA binding protein- 
mRNA complexes. Preferably, the ribosomal protein or mRNA binding protein to be 
tagged is from the same species as the cell that is to express the molecularly tagged protein. 

Nucleic acids encoding the molecularly tagged ribosomal proteins and mRNA
binding proteins of the invention may be produced using routine genetic engineering 
methods and cloning and expression vectors that are well known in the art. Nucleic acids 
encoding the ribosomal protein or mRNA binding protein to be molecularly tagged may be 
obtained using any method known in the art. The sequences for many ribosomal and
mRNA binding proteins are known (see Table 2 in Section 5.2 below providing GenBank
accession numbers for many human and murine ribosomal proteins). Nucleic acids may be
obtained, for example, by PCR using oligonucleotide primers based upon the published
sequences. Other related ribosomal and mRNA binding proteins (for example from other
species) may be obtained by low, medium or high stringency hybridization of appropriate
nucleic acid libraries using the nucleic acid encoding the ribosomal or mRNA binding
protein in hand as a probe. The
nucleic acids encoding the desired ribosomal or mRNA binding protein may then be
incorporated into a nucleic acid vector appropriate for additional molecular
manipulations and/or for incorporation and expression in the host cells of interest. The
nucleotide sequences encoding the peptide tag may likewise be obtained using methods well
known in the art. For example, if the tag is fairly short, a nucleic acid encoding the tag and
appropriate for generating a fusion protein with the ribosomal or mRNA binding protein
may be constructed using oligonucleotides to form the double stranded nucleic acid
encoding the peptide tag. The synthetic nucleic acid may then be cloned and used for
generating fusion proteins with ribosomal proteins or mRNA binding proteins.
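
As an illustration of forming a double stranded nucleic acid encoding a short tag from oligonucleotides, the following Python sketch derives the complementary (antisense) oligonucleotide for a given sense-strand sequence. It is a minimal sketch under stated assumptions: the sense sequence shown is one hypothetical way of encoding the Strep-tag peptide (Ala-Trp-Arg-His-Pro-Gln-Phe-Gly-Gly), the function names are introduced here for the example, and the resulting duplex is blunt-ended. Any overhangs or restriction sites needed for cloning would have to be added for the particular vector used.

```python
# Minimal sketch: design a pair of complementary oligonucleotides that anneal
# to form a blunt-ended duplex encoding a short peptide tag.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_tag_oligos(sense: str) -> tuple[str, str]:
    """Return (sense, antisense) oligonucleotides, both written 5'->3'."""
    sense = sense.upper()
    if set(sense) - set("ACGT"):
        raise ValueError("sequence may contain only A, C, G and T")
    if len(sense) % 3:
        raise ValueError("tag coding sequence must be a whole number of codons")
    return sense, reverse_complement(sense)


# Hypothetical coding sequence for Ala-Trp-Arg-His-Pro-Gln-Phe-Gly-Gly:
# GCC TGG CGC CAC CCC CAG TTC GGC GGC
strep_tag_sense = "GCCTGGCGCCACCCCCAGTTCGGCGGC"
sense_oligo, antisense_oligo = design_tag_oligos(strep_tag_sense)
print("sense     5'-" + sense_oligo + "-3'")
print("antisense 5'-" + antisense_oligo + "-3'")
```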
In certain embodiments, a nucleic acid molecule encoding a molecularly tagged
ribosomal protein is intended for a particular expression system, in which the codon 
frequencies reflect the tRNA frequencies of the host cell or organism in which the protein is 
expressed. Codon optimization allows for maximum protein expression by increasing the 
translational efficiency of a gene of interest. Codon optimization is a standard component
of custom gene design, and may be obtained from commercial service providers (e.g.,
Aptagen, Inc., Herndon, VA; Integrated DNA Technologies, Skokie, IL). 

The nucleic acid encoding a molecularly tagged ribosomal protein may be a 
synthetic nucleic acid in which the codons have been optimized for increased expression in 
the host cell in which it is produced. The degeneracy of the genetic code permits variations
of the nucleotide sequence, while still producing a polypeptide having the identical amino
acid sequence as the polypeptide encoded by the native DNA sequence. The frequency of 
individual synonymous codons for amino acids varies widely from genome to genome 
among eukaryotes and prokaryotes. The overall expression levels of individual genes may 
be regulated by differences in codon choice, which modulates peptide elongation rates.
Native codons may be exchanged for codons of highly expressed genes in the host cells.
For instance, the nucleic acid molecule can be optimized for expression of the encoded
protein in bacterial cells (e.g., E. coli), yeast (e.g., Pichia), insect cells (e.g., Drosophila), or
mammalian cells or animals (e.g., human, sheep, bovine or mouse cells or animals).

Restriction enzyme sites critical for gene synthesis and DNA manipulation can be
preserved or destroyed to facilitate nucleic acid and vector construction and expression of
the encoded protein. In constructing the synthetic nucleic acids of the invention, it may be
desirable to avoid sequences that may cause gene silencing. The codon optimized sequence 
is synthesized and assembled, and inserted into an appropriate expression vector using 
conventional techniques well known to those of skill in the art. 

In a preferred embodiment, a synthetic nucleic acid encoding a molecularly tagged
ribosomal protein comprises at least one codon substitution in which a non-preferred or less
preferred codon in the natural gene encoding the protein has been replaced by a preferred
codon encoding the same amino acid. The relative frequency of use for each codon can
vary significantly between species, although certain codons are infrequently used across
species (Zhang et al., 1991, Low-usage codons in Escherichia coli, yeast, fruit fly, and
primates. Gene, 105:61-72). For instance, in humans the preferred codons are: Ala (GCC);
Arg (CGC); Asn (AAC); Asp (GAC); Cys (TGC); Gln (CAG); Gly (GGC); His (CAC); Ile
(ATC); Leu (CTG); Lys (AAG); Pro (CCC); Phe (TTC); Ser (AGC); Thr (ACC); Tyr
(TAC); and Val (GTG). Less preferred codons are: Gly (GGG); Ile (ATT); Leu (CTC); Ser
(TCC); Val (GTC); and Arg (AGG). All codons that do not fit the description of preferred
codons or less preferred codons are non-preferred codons.
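
The three codon classes defined above can be expressed as a simple classification routine. The Python sketch below is illustrative only and makes two assumptions that should be noted: the function names are introduced here for the example, and the preferred Tyr codon is taken to be TAC, consistent with the Tyr frequencies of Table 1. Because the rule above assigns every unlisted codon to the non-preferred class, codons for amino acids absent from the lists (for example, ATG for Met and TGG for Trp) are reported as non-preferred even though no alternative codon exists for them.

```python
from collections import Counter

# Preferred and less preferred human codons, as listed in the text above.
PREFERRED = {
    "Ala": "GCC", "Arg": "CGC", "Asn": "AAC", "Asp": "GAC", "Cys": "TGC",
    "Gln": "CAG", "Gly": "GGC", "His": "CAC", "Ile": "ATC", "Leu": "CTG",
    "Lys": "AAG", "Pro": "CCC", "Phe": "TTC", "Ser": "AGC", "Thr": "ACC",
    "Tyr": "TAC",  # assumption: TAC, consistent with Table 1
    "Val": "GTG",
}
LESS_PREFERRED = {
    "Gly": "GGG", "Ile": "ATT", "Leu": "CTC",
    "Ser": "TCC", "Val": "GTC", "Arg": "AGG",
}
PREFERRED_CODONS = set(PREFERRED.values())
LESS_PREFERRED_CODONS = set(LESS_PREFERRED.values())


def classify_codon(codon: str) -> str:
    """Classify one codon as preferred, less preferred or non-preferred."""
    codon = codon.upper()
    if codon in PREFERRED_CODONS:
        return "preferred"
    if codon in LESS_PREFERRED_CODONS:
        return "less preferred"
    return "non-preferred"


def codon_classes(cds: str) -> Counter:
    """Count the codon classes found in a coding sequence."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence must be a whole number of codons")
    return Counter(classify_codon(cds[i:i + 3]) for i in range(0, len(cds), 3))


# Hypothetical example: three Ile codons (ATC, ATT, ATA) followed by Leu (CTG).
example = "ATCATTATACTG"
print(codon_classes(example))
```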

In general, the degree of preference of a particular codon is indicated by the
prevalence of the codon in highly expressed genes. Codon preferences for highly expressed
human genes are as indicated in Table 1. For example, "ATC" represents 77% of the Ile
codons in highly expressed mammalian genes and is the preferred Ile codon; "ATT"
represents 18% of the Ile codons in highly expressed mammalian genes and is the less
preferred Ile codon. The sequence "ATA" represents only 5% of the Ile codons in highly
expressed human genes and is a non-preferred Ile codon. Replacing a codon with another
codon that is more prevalent in highly expressed human genes will generally increase
expression of the gene in mammalian cells. Accordingly, the invention includes replacing a
less preferred codon with a preferred codon as well as replacing a non-preferred codon with
a preferred or less preferred codon.
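
The replacement strategy described above can be sketched as follows, using the percentages of Table 1. This Python sketch is illustrative only: the function name `optimize_cds` is an assumption made for the example, and the routine adopts the simplest possible policy of replacing every codon with the most prevalent synonymous codon. Codons not listed in Table 1 (ATG for Met, TGG for Trp and the stop codons) are left unchanged. A practical design would also weigh the other considerations noted in this section, such as restriction sites and sequences that may cause gene silencing.

```python
# Codon frequencies (percent) in highly expressed human genes, from Table 1.
CODON_FREQ = {
    "Ala": {"GCC": 53, "GCT": 17, "GCA": 13, "GCG": 17},
    "Arg": {"CGC": 37, "CGT": 7, "CGA": 6, "CGG": 21, "AGA": 10, "AGG": 18},
    "Asn": {"AAC": 78, "AAT": 22},
    "Asp": {"GAC": 75, "GAT": 25},
    "Leu": {"CTC": 26, "CTT": 5, "CTA": 3, "CTG": 58, "TTA": 2, "TTG": 6},
    "Lys": {"AAA": 18, "AAG": 82},
    "Pro": {"CCC": 48, "CCT": 19, "CCA": 16, "CCG": 17},
    "Phe": {"TTC": 80, "TTT": 20},
    "Cys": {"TGC": 68, "TGT": 32},
    "Gln": {"CAA": 12, "CAG": 88},
    "Glu": {"GAA": 25, "GAG": 75},
    "Gly": {"GGC": 50, "GGT": 12, "GGA": 14, "GGG": 24},
    "His": {"CAC": 79, "CAT": 21},
    "Ile": {"ATC": 77, "ATT": 18, "ATA": 5},
    "Ser": {"TCC": 28, "TCT": 13, "TCA": 5, "TCG": 9, "AGC": 34, "AGT": 10},
    "Thr": {"ACC": 57, "ACT": 14, "ACA": 14, "ACG": 15},
    "Tyr": {"TAC": 74, "TAT": 26},
    "Val": {"GTC": 25, "GTT": 7, "GTA": 5, "GTG": 64},
}
# Reverse lookup: codon -> amino acid.
CODON_TO_AA = {
    codon: amino_acid
    for amino_acid, codons in CODON_FREQ.items()
    for codon in codons
}


def optimize_cds(cds: str) -> str:
    """Replace each codon with the most prevalent synonymous codon."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence must be a whole number of codons")
    optimized = []
    for pos in range(0, len(cds), 3):
        codon = cds[pos:pos + 3]
        amino_acid = CODON_TO_AA.get(codon)
        if amino_acid is None:
            # Met, Trp and stop codons are not in Table 1; keep them as is.
            optimized.append(codon)
        else:
            freqs = CODON_FREQ[amino_acid]
            optimized.append(max(freqs, key=freqs.get))
    return "".join(optimized)


# Hypothetical input: Ile (ATA), Gly (GGG), Leu (TTA), Lys (AAA).
print(optimize_cds("ATAGGGTTAAAA"))
```

The encoded amino acid sequence is unchanged by the substitution, since each codon is replaced only by a synonymous codon.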

In a particularly preferred embodiment, the nucleic acid has been optimized for 
expression of the encoded protein in human or mammalian cells or organisms. 

Table 1. Codon Frequency (Percentage) in Highly Expressed Human Genes

Amino acid    Codon    Frequency (%)

Ala           GCC      53
              GCT      17
              GCA      13
              GCG      17

Arg           CGC      37
              CGT       7
              CGA       6
              CGG      21
              AGA      10
              AGG      18

Asn           AAC      78
              AAT      22

Asp           GAC      75
              GAT      25

Leu           CTC      26
              CTT       5
              CTA       3
              CTG      58
              TTA       2
              TTG       6

Lys           AAA      18
              AAG      82

Pro           CCC      48
              CCT      19
              CCA      16
              CCG      17

Phe           TTC      80
              TTT      20

Cys           TGC      68
              TGT      32

Gln           CAA      12
              CAG      88

Glu           GAA      25
              GAG      75

Gly           GGC      50
              GGT      12
              GGA      14
              GGG      24

His           CAC      79
              CAT      21

Ile           ATC      77
              ATT      18
              ATA       5

Ser           TCC      28
              TCT      13
              TCA       5
              TCG       9
              AGC      34
              AGT      10

Thr           ACC      57
              ACT      14
              ACA      14
              ACG      15

Tyr           TAC      74
              TAT      26

Val           GTC      25
              GTT       7
              GTA       5
              GTG      64



In particular embodiments, the invention provides fusion proteins (including isolated 
or purified fusion proteins) containing all or a functional portion of a ribosomal protein or 
mRNA binding protein and a peptide tag, as described above, as well as intact ribosomes
and complexes of mRNA and mRNA binding protein (including isolated and purified intact
ribosomes and complexes). The invention further provides nucleic acids comprising 
nucleotide sequences encoding the ribosomal and mRNA binding protein fusions with 
peptide tags of the invention, vectors containing these nucleic acids, and host cells 
containing nucleic acids encoding the ribosomal and mRNA binding protein fusion proteins
of the invention.

5.2. SELECTION OF RIBOSOMAL PROTEIN FOR TAGGING 

Any ribosomal protein or mRNA binding protein may be molecularly tagged for use 
in the methods of the invention. The ribosome containing the tagged protein should bind
mRNA and, preferably, also translate the mRNA into protein, and the peptide tag in the
intact ribosome should be accessible to the corresponding isolation reagent. Likewise, if an 
mRNA binding protein is used, the tagged mRNA binding protein should bind mRNA, and 
the peptide tag should be accessible to the corresponding isolation reagent. Accordingly, 
selection of an appropriate ribosomal protein for tagging can be based upon accessibility to 

30 affinity reagents such as antibodies against N- and C-termini or other portions of the 
proteins in intact ribosomes (Syu WJ, Kahan L. Both ends of Escherichia coli ribosomal 
protein SI 3 are immunochemically accessible in situ. J Protein Chem. 1992 
Jun;l l(3):225-30; reviewed in Syu WJ, Kahan B, Kahan L. Detecting immunocomplex 
formation in sucrose gradients by enzyme immunoassay: application in determining epitope 

35 accessibility on ribosomes. Anal Biochem. 1991 Jul;196(l):174-7). However, accessibility 
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does not imply that once tagged, the ribosomal protein will function appropriately. One 
assay of proper function of a tagged variant is the determination, via immunohistochemistry, 
that the tagged protein displays expected subcellular localization when expressed in cultured 
cells. The determination that the tag appears in a preparation of polysomes isolated from 

5 transfected cells is an indication that ribosomal function is not greatly perturbed by the 
incorporation of the tagged protein into the organelle. See e.g., Rosorius et aL, 2000, 
Human Ribosomal Protein L5 Contains Defined Nuclear Localization and Export Signals, J. 
Biol. Chem. 275(16): 12061-12068, and Russo et al, 1997, Different Domains Cooperate 
to Target the Human Ribosomal L7a Protein to the Nucleus and to the Nucleoli, J. Biol. 

10 Chem. 272(8): 5229-5235, both of which are hereby incorporated by reference in their 
entireties. 

More thorough evaluations of any possible perturbation of ribosomal function 
involve comparisons of cellular physiology in transfected and untransfected cells. For 
example, comparisons of relative protein or mRNA abundances in transfected and 
untransfected cells would be such measures of cellular physiology. An appropriate 
ribosomal protein will be one which, when tagged, is incorporated into ribosomes, allows 
those ribosomes to function without unduly affecting cellular physiology, and has the 
tag positioned so as to be accessible to affinity purification reagents. 
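
As a non-limiting illustration of the abundance comparison described above, the following Python sketch computes a log2 ratio of mRNA (or protein) abundance between transfected and untransfected cells and lists the genes whose ratio exceeds a cutoff. The function names, the pseudocount, the cutoff of one log2 unit, and the gene names are all assumptions chosen for the example and are not part of the described methods.

```python
import math


def log2_ratios(transfected, untransfected, pseudocount=1.0):
    """Log2 abundance ratio (transfected / untransfected) per shared gene.

    Both arguments map a gene name to a non-negative abundance value.
    The pseudocount (an arbitrary choice) avoids division by zero.
    """
    shared = sorted(transfected.keys() & untransfected.keys())
    return {
        gene: math.log2(
            (transfected[gene] + pseudocount)
            / (untransfected[gene] + pseudocount)
        )
        for gene in shared
    }


def perturbed_genes(transfected, untransfected, cutoff=1.0):
    """Genes whose absolute log2 ratio meets the (assumed) cutoff."""
    ratios = log2_ratios(transfected, untransfected)
    return [gene for gene, value in ratios.items() if abs(value) >= cutoff]


if __name__ == "__main__":
    # Hypothetical abundance values for two hypothetical genes.
    with_tag = {"geneA": 100.0, "geneB": 50.0}
    without_tag = {"geneA": 100.0, "geneB": 10.0}
    print(perturbed_genes(with_tag, without_tag))
```

Under these assumptions, an empty or short list would be consistent with the tagged protein not unduly affecting cellular physiology, whereas a long list would suggest that a different ribosomal protein or tag position should be considered.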

The methods of Herfurth et al. (1995, Determination of peptide regions exposed at 
the surface of the bacterial ribosome with antibodies against synthetic peptides. Biol Chem 
Hoppe Seyler 376(2):81-90; which is hereby incorporated by reference in its entirety) may 
be used to determine, before tagging, which parts of particular ribosomal proteins are 
accessible in the intact ribosome. 

Once accessibility is determined, one can determine whether ribosomes containing 
the tagged riboprotein are functional using routine assays well known in the art. Analogous 
tests for accessibility of the tag in tagged mRNA binding proteins and formation and 
function of the mRNA binding protein-mRNA complex will be apparent to the skilled 
artisan for identifying and designing appropriate tagged mRNA binding proteins for use in 
the present invention. 

Ribosomal proteins or protein subunits or mRNA binding proteins suitable for use in 
the methods of the invention are preferably of the same species as the host cell to be 
transformed, but in certain embodiments, may be of a different species. 

Ribosomal proteins or protein subunits suitable for use in the methods of the 
invention include, but are not limited to, mouse and human ribosomal proteins in Tables 2 
and 3. In Tables 2 and 3, the GenBank accession number is followed by a description of the 
ribosomal protein as it appears in GenBank: 

TABLE 2 
Mouse Ribosomal Proteins 

BC006068 - ribosomal protein L10, clone IMAGE:3593057, mRNA 
gi|13543840|gb|BC006068.1|BC006068[13543840] 

U17332 - ribosomal protein L9 (musl9) mRNA, partial cds 
gi|687603|gb|U17332.1|MMU17332[687603] 

U17331 - mutant ribosomal protein L9 (musl9mu) mRNA, partial cds 
gi|687601|gb|U17331.1|MMU17331[687601] 

AY043296 - ribosomal protein S3 (Rps3) gene, complete cds 
gi|15421126|gb|AY043296.1|[15421126] 

BC013165 - ribosomal protein L9, clone MGC:6543 IMAGE:2655358, mRNA, complete cds 
gi|15341947|gb|BC013165.1|BC013165[15341947] 

BC012641 - ribosomal protein S11, clone MGC:13737 IMAGE:4019309, mRNA, complete cds 
gi|15215035|gb|BC012641.1|BC012641[15215035] 

NM_021338 - ribosomal protein L35a (Rpl35a), mRNA 
gi|15042946|ref|NM_021338.2|[15042946] 

Y16430 - mRNA for ribosomal protein L35a 
gi|15024263|emb|Y16430.2|MMY16430[15024263] 

AF043285 - ribosomal protein S7 (rpS7) gene, complete cds 
gi|2811283|gb|AF043285.1|AF043285[2811283] 

BC010721 - ribosomal protein S3, clone MGC:6565 IMAGE:2811930, mRNA, complete cds 
gi|14715106|gb|BC010721.1|BC010721[14715106] 

BC010604 - ribosomal protein S6, clone MGC:6573 IMAGE:3481640, mRNA, complete cds 
gi|14714896|gb|BC010604.1|BC010604[14714896] 

BC009100 - ribosomal protein S4, X-linked, clone MGC:6575 IMAGE:3482299, mRNA, complete cds 
gi|14318605|gb|BC009100.1|BC009100[14318605] 

BC005790 - ribosomal protein L5, clone IMAGE:2811648, mRNA 
gi|14710611|gb|BC005790.1|BC005790[14710611] 

BC008223 - ribosomal protein L31, clone MGC:6449 IMAGE:2599150, mRNA, complete cds 
gi|14198320|gb|BC008223.1|BC008223[14198320] 

BC007139 - ribosomal protein L22, clone MGC:6121 IMAGE:3487607, mRNA, complete cds 
gi|13938045|gb|BC007139.1|BC007139[13938045] 

BC003896 - ribosomal protein L17, clone MGC:6758 IMAGE:3594373, mRNA, complete cds 
gi|13278089|gb|BC003896.1|BC003896[13278089] 

BC003829 - laminin receptor 1 (67kD, ribosomal protein SA), clone MGC:6243 IMAGE:3600738, mRNA, complete cds 
gi|13277920|gb|BC003829.1|BC003829[13277920] 

BC002145 - ribosomal protein S23, clone MGC:7260 IMAGE:3484753, mRNA, complete cds 
gi|12805350|gb|BC002145.1|BC002145[12805350] 

BC002110 - ribosomal protein L24, clone MGC:6606 IMAGE:3488279, mRNA, complete cds 
gi|12805288|gb|BC002110.1|BC002110[12805288] 

BC002088 - ribosomal protein S25, clone MGC:6338 IMAGE:3487037, mRNA, complete cds 
gi|12805250|gb|BC002088.1|BC002088[12805250] 

BC002062 - ribosomal protein L29, clone MGC:6127 IMAGE:3590425, mRNA, complete cds 
gi|12805206|gb|BC002062.1|BC002062[12805206] 

BC002060 - ribosomal protein L30, clone MGC:6114 IMAGE:3489311, mRNA, complete cds 
gi|12805202|gb|BC002060.1|BC002060[12805202] 

BC002044 - ribosomal protein S17, clone MGC:6030 IMAGE:3484265, mRNA, complete cds 
gi|12805170|gb|BC002044.1|BC002044[12805170] 

BC002014 - ribosomal protein S7, clone MGC:5812 IMAGE:3484169, mRNA, complete cds 
gi|12805114|gb|BC002014.1|BC002014[12805114] 

AF374195 - ribosomal protein L6 (Rpl6) gene, complete cds 
gi|14210105|gb|AF374195.1|AF374195[14210105] 

NM_011292 - ribosomal protein L9 (Rpl9), mRNA 
gi|14149646|ref|NM_011292.1|[14149646] 

AF227523 - ribosomal protein L3 (Rpl3) gene, partial cds 
gi|13383337|gb|AF227523.1|AF227523[13383337] 

NM_011289 - ribosomal protein L27 (Rpl27), mRNA 
gi|8567399|ref|NM_011289.1|[8567399] 

NM_019647 - ribosomal protein L21 (Rpl21), mRNA 
gi|9789992|ref|NM_019647.1|[9789992] 

NM_011029 - laminin receptor 1 (67kD, ribosomal protein SA) (Lamr1), mRNA 
gi|6754967|ref|NM_011029.1|[6754967] 

NM_023133 - ribosomal protein S19 (Rps19), mRNA 
gi|12963510|ref|NM_023133.1|[12963510] 

NM_022891 - ribosomal protein L23 (Rpl23), mRNA 
gi|12584985|ref|NM_022891.1|[12584985] 

AF287271 - ribosomal protein L23 (Rpl23) mRNA, complete cds 
gi|9502281|gb|AF287271.1|AF287271[9502281] 

AF158022 - ribosomal protein L23 (Rpl23) gene, complete cds 
gi|5354204|gb|AF158022.1|AF158022[5354204] 

NM_018853 - ribosomal protein, large, P1 (Rplp1), mRNA 
gi|9256518|ref|NM_018853.1|[9256518] 

NM_020600 - ribosomal protein S14 (Rps14), mRNA 
gi|10181111|ref|NM_020600.1|[10181111] 

NM_019865 - ribosomal protein L44 (Rpl44), mRNA 
gi|9845294|ref|NM_019865.1|[9845294] 

NM_018730 - ribosomal protein L36 (Rpl36), mRNA 
gi|9055321|ref|NM_018730.1|[9055321] 

NM_016959 - ribosomal protein S3a (Rps3a), mRNA 
gi|8394217|ref|NM_016959.1|[8394217] 

NM_016738 - ribosomal protein L13 (Rpl13), mRNA 
gi|7949126|ref|NM_016738.1|[7949126] 

NM_013765 - ribosomal protein S26 (Rps26), mRNA 
gi|7305446|ref|NM_013765.1|[7305446] 

NM_013647 - ribosomal protein S16 (Rps16), mRNA 
gi|7305444|ref|NM_013647.1|[7305444] 

NM_013721 - ribosomal protein L7a (Rpl7a), mRNA 
gi|7305442|ref|NM_013721.1|[7305442] 

NM_013762 - ribosomal protein L3 (Rpl3), mRNA 
gi|7305440|ref|NM_013762.1|[7305440] 

NM_009438 - ribosomal protein L13a (Rpl13a), mRNA 
gi|7110730|ref|NM_009438.1|[7110730] 

NM_011300 - ribosomal protein S7 (Rps7), mRNA 
gi|6755375|ref|NM_011300.1|[6755375] 

NM_012052 - ribosomal protein S3 (Rps3), mRNA 
gi|6755371|ref|NM_012052.1|[6755371] 

NM_011297 - ribosomal protein S24 (Rps24), mRNA 
gi|6755369|ref|NM_011297.1|[6755369] 

NM_011296 - ribosomal protein S18 (Rps18), mRNA 
gi|6755367|ref|NM_011296.1|[6755367] 

NM_011295 - ribosomal protein S12 (Rps12), mRNA 
gi|6755365|ref|NM_011295.1|[6755365] 

NM_012053 - ribosomal protein L8 (Rpl8), mRNA 
gi|6755357|ref|NM_012053.1|[6755357] 

NM_011291 - ribosomal protein L7 (Rpl7), mRNA 
gi|6755355|ref|NM_011291.1|[6755355] 

NM_011290 - ribosomal protein L6 (Rpl6), mRNA 
gi|6755353|ref|NM_011290.1|[6755353] 

NM_011287 - ribosomal protein L10A (Rpl10a), mRNA 
gi|6755349|ref|NM_011287.1|[6755349] 

NM_009098 - ribosomal protein S8 (Rps8), mRNA 
gi|6677812|ref|NM_009098.1|[6677812] 

NM_009096 - ribosomal protein S6 (Rps6), mRNA 
gi|6677808|ref|NM_009096.1|[6677808] 

NM_009095 - ribosomal protein S5 (Rps5), mRNA 
gi|6677806|ref|NM_009095.1|[6677806] 

NM_009094 - ribosomal protein S4, X-linked (Rps4x), mRNA 
gi|6677804|ref|NM_009094.1|[6677804] 

NM_009093 - ribosomal protein S29 (Rps29), mRNA 
gi|6677802|ref|NM_009093.1|[6677802] 

NM_009092 - ribosomal protein S17 (Rps17), mRNA 
gi|6677800|ref|NM_009092.1|[6677800] 

NM_009091 - ribosomal protein S15 (Rps15), mRNA 
gi|6677798|ref|NM_009091.1|[6677798] 

NM_009084 - ribosomal protein L37a (Rpl37a), mRNA 
gi|6677784|ref|NM_009084.1|[6677784] 

NM_009083 - ribosomal protein L30 (Rpl30), mRNA 
gi|6677782|ref|NM_009083.1|[6677782] 

NM_009082 - ribosomal protein L29 (Rpl29), mRNA 
gi|6677780|ref|NM_009082.1|[6677780] 

NM_009081 - ribosomal protein L28 (Rpl28), mRNA 
gi|6677778|ref|NM_009081.1|[6677778] 

NM_009080 - ribosomal protein L26 (Rpl26), mRNA 
gi|6677776|ref|NM_009080.1|[6677776] 

NM_009079 - ribosomal protein L22 (Rpl22), mRNA 
gi|6677774|ref|NM_009079.1|[6677774] 

NM_009078 - ribosomal protein L19 (Rpl19), mRNA 
gi|6677772|ref|NM_009078.1|[6677772] 

NM_009077 - ribosomal protein L18 (Rpl18), mRNA 
gi|6677770|ref|NM_009077.1|[6677770] 

NM_009076 - ribosomal protein L12 (Rpl12), mRNA 
gi|6677768|ref|NM_009076.1|[6677768] 

Y12431 - mRNA for ribosomal protein S5 
gi|3717977|emb|Y12431.1|MMRPS5[3717977] 

AF236069 - ribosomal protein L29 gene, complete cds 
gi|7800211|gb|AF236069.1|AF236069[7800211] 

AF283559 - ribosomal protein S2 mRNA, complete cds 
gi|10179939|gb|AF283559.1|AF283559[10179939] 

AB037665 - rpl38 mRNA for ribosomal protein L38, complete cds 
gi|9650959|dbj|AB037665.1|AB037665[9650959] 

X83590 - mRNA for ribosomal protein L5, 3' end 
gi|619503|emb|X83590.1|MMRPL5[619503] 

AF260271 - 60S ribosomal Protein L9 mRNA, complete cds 
gi|7862171|gb|AF260271.1|AF260271[7862171] 

AF216207 - ribosomal protein S19 (Rps19) gene, complete cds 
gi|7648817|gb|AF216207.1|AF216207[7648817] 

AF214527 - ribosomal protein L27 (RPL27) mRNA, complete cds 
gi|6708473|gb|AF214527.1|AF214527[6708473] 

AB020237 - gene for ribosomal protein L27A, complete cds 
gi|4760603|dbj|AB020237.1|AB020237[4760603] 

AF091511 - ribosomal protein L8 (Rpl8) gene, partial cds 
gi|3851578|gb|AF091511.1|AF091511[3851578] 

U89419 - strain BALB/c 60S acidic ribosomal protein P0 mRNA, partial cds 
gi|3642675|gb|U89419.1|MMU89419[3642675] 

U89418 - strain BALB/c ribosomal protein S2 (LLRep3) mRNA, partial cds 
gi|3642670|gb|U89418.1|MMU89418[3642670] 

U89417 - strain BALB/c ribosomal protein L3 mRNA, partial cds 
gi|3642668|gb|U89417.1|MMU89417[3642668] 

U89414 - strain BALB/c ribosomal protein S3 mRNA, partial cds 
gi|3642662|gb|U89414.1|MMU89414[3642662] 

U67771 - ribosomal protein L8 (RPL8) mRNA, complete cds 
gi|1527177|gb|U67771.1|MMU67771[1527177] 

K02060 - ribosomal protein L32-3A (3A) gene, complete cds 
gi|3228365|gb|K02060.1|MUSRPL3A[3228365] 

Y08307 - mRNA for ribosomal protein S14 
gi|1565267|emb|Y08307.1|MMMRPS14[1565267] 

U78085 - ribosomal protein S5 mRNA, complete cds 
gi|1685070|gb|U78085.1|MMU78085[1685070] 

D25213 - rpS17 mRNA for ribosomal protein S17, complete cds 
gi|893394|dbj|D25213.1|MUSRPS17[893394] 

U93864 - ribosomal protein S11 mRNA, complete cds 
gi|1938405|gb|U93864.1|MMU93864[1938405] 

U93863 - ribosomal protein L21 mRNA, complete cds 
gi|1938403|gb|U93863.1|MMU93863[1938403] 

U93862 - ribosomal protein L41 mRNA, complete cds 
gi|1938401|gb|U93862.1|MMU93862[1938401] 

M62952 - ribosomal protein L19, complete cds 
gi|198642|gb|M62952.1|MUSL19RP[198642] 

L31609 - clone mcori-lck9, S29 ribosomal protein mRNA, complete cds 
gi|1220417|gb|L31609.1|MUSS29RP[1220417] 

U67770 - ribosomal protein S26 (RPS26) mRNA, complete cds 
gi|1527175|gb|U67770.1|MMU67770[1527175] 

X54067 - SURF-3 gene for ribosomal protein L7a (rpL7a) 
gi|54209|emb|X54067.1|MMSURF3[54209] 

Z32550 - gene for ribosomal protein L35a 
gi|563529|emb|Z32550.1|MMRPL35[563529] 

X73829 - mRNA for ribosomal protein S8 
gi|313297|emb|X73829.1|MMRPS8[313297] 

X73331 - mRNA for ribosomal protein L37a 
gi|312413|emb|X73331.1|MMRP37A[312413] 

X60289 - mRNA for ribosomal protein S24 
gi|311296|emb|X60289.1|MMRPS24[311296] 

Y00348 - mRNA for ribosomal protein S6 
gi|54009|emb|Y00348.1|MMRPS6[54009] 

X15962 - mRNA for ribosomal protein S12 
gi|54005|emb|X15962.1|MMRPS12[54005] 

X74856 - L28 mRNA for ribosomal protein L28 
gi|488834|emb|X74856.1|MMRNAL28[488834] 

X76772 - mRNA for ribosomal protein S3 
gi|439521|emb|X76772.1|MMRIBPS3[439521] 

X57960 - mRNA for ribosomal protein L7 
gi|53911|emb|X57960.1|MMRBPRL7A[53911] 

X57961 - mRNA for ribosomal protein L7 
gi|55488|emb|X57961.1|MRBPRL7B[55488] 

X75895 - mRNA for ribosomal protein L36 
gi|443801|emb|X75895.1|MML36[443801] 

U28917 - 60S ribosomal protein (A52) mRNA, complete cds 
gi|899444|gb|U28917.1|MMU28917[899444] 

M73436 - ribosomal protein S4 (Rps4) mRNA, complete cds 
gi|200863|gb|M73436.1|MUSRSP4[200863] 

L24371 - clone FVB41, ribosomal protein S4 gene, partial cds 
gi|402310|gb|L24371.1|MUSRPS4B[402310] 

M77296 - ribosomal protein S4 (Rps4) gene, partial cds 
gi|200798|gb|M77296.1|MUSRPS4A[200798] 

M29016 - ribosomal protein L7 (rpL7) mRNA, 5' end 
gi|200786|gb|M29016.1|MUSRPL7R[200786] 

M29015 - ribosomal protein L7 (rpL7) gene, complete cds 
gi|200784|gb|M29015.1|MUSRPL7A[200784] 

M23453 - ribosomal protein L32' (rpL32') gene, complete cds 
gi|200778|gb|M23453.1|MUSRPL32A[200778] 

L04128 - ribosomal protein L18 (rpL18) mRNA, complete cds 
gi|398049|gb|L04128.1|MUSRPL18A[398049] 

L04280 - ribosomal protein (Rpl12) mRNA, complete cds 
gi|398047|gb|L04280.1|MUSRPL12A[398047] 

M35397 - ribosomal protein L32' (L32') gene, complete cds 
gi|200773|gb|M35397.1|MUSRP32A[200773] 

M85235 - ribosomal protein mRNA, complete cds 
gi|200769|gb|M85235.1|MUSRP[200769] 

U11248 - C57BL/6J ribosomal protein S28 mRNA, complete cds 
gi|508265|gb|U11248.1|MMU11248[508265] 

M76762 - ribosomal protein (Ke-3) gene, exons 1 to 5, and complete cds 
gi|198577|gb|M76762.1|MUSKE3A[198577] 

M76763 - ribosomal protein (Ke-3) mRNA, complete cds 
gi|198579|gb|M76763.1|MUSKE3B[198579] 

M11408 - S16 ribosomal protein gene, complete cds 
gi|435544|gb|M11408.1|MUSRPS16[435544] 

K02928 - ribosomal protein L30 gene, complete cds 
gi|435126|gb|K02928.1|MUSRPL30[435126] 
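
Each entry in Tables 2 and 3 consists of a line giving the accession number and description, followed by a line of pipe-delimited identifiers. As an illustration only, the following Python sketch parses one such entry and checks that the two lines agree. The field names, the regular expressions, and the function parse_entry are assumptions made for this example and do not represent any GenBank software interface.

```python
import re

# First line: "ACCESSION - description as it appears in GenBank".
HEADER_RE = re.compile(r"^(?P<accession>[A-Z]+_?\d+) - (?P<description>.+)$")

# Second line: "gi|GI|db|ACCESSION.VERSION|LOCUS[GI]" (LOCUS may be empty).
IDENT_RE = re.compile(
    r"^gi\|(?P<gi>\d+)\|(?P<db>[a-z]+)\|"
    r"(?P<accession>[A-Z]+_?\d+)\.(?P<version>\d+)\|"
    r"(?P<locus>[^\[\]]*)\[(?P<gi_repeat>\d+)\]$"
)


def parse_entry(header_line, identifier_line):
    """Parse one two-line table entry into a dictionary.

    Raises ValueError if either line is malformed or if the accession
    number or gi number is inconsistent between the two lines.
    """
    header = HEADER_RE.match(header_line.strip())
    ident = IDENT_RE.match(identifier_line.strip())
    if header is None or ident is None:
        raise ValueError("entry does not match the expected layout")
    if header["accession"] != ident["accession"]:
        raise ValueError("accession numbers differ between the two lines")
    if ident["gi"] != ident["gi_repeat"]:
        raise ValueError("gi numbers differ within the identifier line")
    return {
        "accession": header["accession"],
        "description": header["description"],
        "gi": int(ident["gi"]),
        "database": ident["db"],
        "version": int(ident["version"]),
        "locus": ident["locus"],
    }


if __name__ == "__main__":
    entry = parse_entry(
        "K02928 - ribosomal protein L30 gene, complete cds",
        "gi|435126|gb|K02928.1|MUSRPL30[435126]",
    )
    print(entry["accession"], entry["gi"], entry["locus"])
```

A consistency check of this kind is useful when the tables have been transcribed or scanned, because the accession number and the gi number each appear twice per entry.
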

TABLE 3 
Human Ribosomal Proteins 

NM_000994 - ribosomal protein L32 (RPL32), mRNA 
gi|15812220|ref|NM_000994.2|[15812220] 

NM_000993 - ribosomal protein L31 (RPL31), mRNA 
gi|15812219|ref|NM_000993.2|[15812219] 

NM_000989 - ribosomal protein L30 (RPL30), mRNA 
gi|15812218|ref|NM_000989.2|[15812218] 

NM_001006 - ribosomal protein S3A (RPS3A), mRNA 
gi|15718688|ref|NM_001006.2|[15718688] 

NM_001005 - ribosomal protein S3 (RPS3), mRNA 
gi|15718686|ref|NM_001005.2|[15718686] 

NM_006013 - ribosomal protein L10 (RPL10), mRNA 
gi|15718685|ref|NM_006013.2|[15718685] 

NM_002954 - ribosomal protein S27a (RPS27A), mRNA 
gi|15431307|ref|NM_002954.2|[15431307] 

NM_001011 - ribosomal protein S7 (RPS7), mRNA 
gi|15431308|ref|NM_001011.2|[15431308] 

NM_033301 - ribosomal protein L8 (RPL8), transcript variant 2, mRNA 
gi|15431305|ref|NM_033301.1|[15431305] 

NM_000973 - ribosomal protein L8 (RPL8), transcript variant 1, mRNA 
gi|15431304|ref|NM_000973.2|[15431304] 

NM_000661 - ribosomal protein L9 (RPL9), mRNA 
gi|15431302|ref|NM_000661.2|[15431302] 

NM_000971 - ribosomal protein L7 (RPL7), mRNA 
gi|15431300|ref|NM_000971.2|[15431300] 

NM_000980 - ribosomal protein L18a (RPL18A), mRNA 
gi|15431299|ref|NM_000980.2|[15431299] 

NM_000979 - ribosomal protein L18 (RPL18), mRNA 
gi|15431298|ref|NM_000979.2|[15431298] 

NM_000977 - ribosomal protein L13 (RPL13), transcript variant 1, mRNA 
gi|15431296|ref|NM_000977.2|[15431296] 

NM_033251 - ribosomal protein L13 (RPL13), transcript variant 2, mRNA 
gi|15431294|ref|NM_033251.1|[15431294] 

NM_002948 - ribosomal protein L15 (RPL15), mRNA 
gi|15431292|ref|NM_002948.2|[15431292] 

NM_000976 - ribosomal protein L12 (RPL12), mRNA 
gi|15431291|ref|NM_000976.2|[15431291] 

NM_000975 - ribosomal protein L11 (RPL11), mRNA 
gi|15431289|ref|NM_000975.2|[15431289] 

NM_007104 - ribosomal protein L10a (RPL10A), mRNA 
gi|15431287|ref|NM_007104.3|[15431287] 

NM_032241 - ribosomal protein L10 (RPL10), mRNA 
gi|14149953|ref|NM_032241.1|[14149953] 

NM_001012 - ribosomal protein S8 (RPS8), mRNA 
gi|4506742|ref|NM_001012.1|[4506742] 

XM_012407 - ribosomal protein L9 (RPL9), mRNA 
gi|15321503|ref|XM_012407.4|[15321503] 

XM_053465 - ribosomal protein L9 (RPL9), mRNA 
gi|15321501|ref|XM_053465.1|[15321501] 

XM_053100 - ribosomal protein L13 (RPL13), mRNA 
gi|15317414|ref|XM_053100.1|[15317414] 

XM_051496 - ribosomal protein S25 (RPS25), mRNA 
gi|15314558|ref|XM_051496.2|[15314558] 

XM_039216 - ribosomal protein S13 (RPS13), mRNA 
gi|15313667|ref|XM_039216.2|[15313667] 

XM_047576 - ribosomal protein S15 (RPS15), mRNA 
gi|15309638|ref|XM_047576.2|[15309638] 

XM_028963 - ribosomal protein L23 (RPL23), mRNA 
gi|15309255|ref|XM_028963.2|[15309255] 

XM_006026 - ribosomal protein S28 (RPS28), mRNA 
gi|15309243|ref|XM_006026.5|[15309243] 

XM_030050 - ribosomal protein L17 (RPL17), mRNA 
gi|15306618|ref|XM_030050.2|[15306618] 

XM_053077 - ribosomal protein S16 (RPS16), mRNA 
gi|15306479|ref|XM_053077.1|[15306479] 

XM_016662 - ribosomal protein L38 (RPL38), mRNA 
gi|14785533|ref|XM_016662.2|[14785533] 

XM_034464 - ribosomal protein S2 (RPS2), mRNA 
gi|14779902|ref|XM_034464.1|[14779902] 

XM_007920 - ribosomal protein L3-like (RPL3L), mRNA 
gi|14779893|ref|XM_007920.4|[14779893] 

XM_009998 - ribosomal protein L3 (RPL3), mRNA 
gi|14779001|ref|XM_009998.4|[14779001] 

XM_039345 - ribosomal protein L3 (RPL3), mRNA 
gi|14778998|ref|XM_039345.1|[14778998] 

XM_039344 - ribosomal protein L3 (RPL3), mRNA 
gi|14778996|ref|XM_039344.1|[14778996] 

XM_039346 - ribosomal protein L3 (RPL3), mRNA 
gi|14778994|ref|XM_039346.1|[14778994] 

XM_047467 - ribosomal protein L13 (RPL13), mRNA 
gi|14776722|ref|XM_047467.1|[14776722] 

XM_047464 - ribosomal protein L13 (RPL13), mRNA 
gi|14776717|ref|XM_047464.1|[14776717] 

XM_047468 - ribosomal protein L13 (RPL13), mRNA 
gi|14776715|ref|XM_047468.1|[14776715] 

XM_047465 - ribosomal protein L13 (RPL13), mRNA 
gi|14776711|ref|XM_047465.1|[14776711] 

XM_027368 - ribosomal protein S15a (RPS15A), mRNA 
gi|14774916|ref|XM_027368.1|[14774916] 

XM_027367 - ribosomal protein S15a (RPS15A), mRNA 
gi|14774912|ref|XM_027367.1|[14774912] 

XM_044693 - ribosomal protein L26 (RPL26), mRNA 
gi|14774237|ref|XM_044693.1|[14774237] 

XM_051497 - ribosomal protein S25 (RPS25), mRNA 
gi|14774084|ref|XM_051497.1|[14774084] 

XM_039215 - ribosomal protein S13 (RPS13), mRNA 
gi|14772983|ref|XM_039215.1|[14772983] 

XM_032124 - ribosomal protein L27 (RPL27), mRNA 
gi|14772981|ref|XM_032124.1|[14772981] 

XM_008208 - ribosomal protein L27 (RPL27), mRNA 
gi|14772978|ref|XM_008208.4|[14772978] 

XM_006388 - ribosomal protein S13 (RPS13), mRNA 
gi|14772975|ref|XM_006388.5|[14772975] 

XM_050589 - ribosomal protein S9 (RPS9), mRNA 
gi|14769524|ref|XM_050589.1|[14769524] 

XM_048412 - region containing hypothetical protein FLJ23544; ribosomal protein L10; ribosomal protein L10; ribosomal protein L10 (LOC88324), mRNA 
gi|14768370|ref|XM_048412.1|[14768370] 

XM_048415 - region containing hypothetical protein FLJ23544; ribosomal protein L10; ribosomal protein L10; ribosomal protein L10 (LOC88324), mRNA 
gi|14768366|ref|XM_048415.1|[14768366] 

XM_038593 - ribosomal protein L18a (RPL18A), mRNA 
gi|14766237|ref|XM_038593.1|[14766237] 

XM_045500 - Finkel-Biskis-Reilly murine sarcoma virus (FBR-MuSV) ubiquitously expressed (fox derived); ribosomal protein S30 (FAU), mRNA 
gi|14765886|ref|XM_045500.1|[14765886] 

XM_006522 - Finkel-Biskis-Reilly murine sarcoma virus (FBR-MuSV) ubiquitously expressed (fox derived); ribosomal protein S30 (FAU), mRNA 
gi|14765881|ref|XM_006522.4|[14765881] 

XM_046112 - ribosomal protein S16 (RPS16), mRNA 
gi|14764309|ref|XM_046112.1|[14764309] 

XM_017838 - ribosomal protein L27a (RPL27A), mRNA 
gi|14763277|ref|XM_017838.2|[14763277] 

XM_044022 - ribosomal protein S4, X-linked (RPS4X), mRNA 
gi|14758953|ref|XM_044022.1|[14758953] 

XM_044024 - ribosomal protein S4, X-linked (RPS4X), mRNA 
gi|14758950|ref|XM_044024.1|[14758950] 

XM_044025 - ribosomal protein S4, X-linked (RPS4X), mRNA 
gi|14758939|ref|XM_044025.1|[14758939] 

XM_050942 - ribosomal protein L6 (RPL6), mRNA 
gi|14758187|ref|XM_050942.1|[14758187] 

XM_050943 - ribosomal protein L6 (RPL6), mRNA 
gi|14758163|ref|XM_050943.1|[14758163] 

XM_016828 - ribosomal protein L44 (RPL44), mRNA 
gi|14757899|ref|XM_016828.2|[14757899] 

XM_035105 - ribosomal protein L7a (RPL7A), mRNA 
gi|14735036|ref|XM_035105.1|[14735036] 

XM_016869 - ribosomal protein L26 homolog (LOC51121), mRNA 
gi|14723097|ref|XM_016869.2|[14723097] 

XM_016124 - ribosomal protein L39 (RPL39), mRNA 
gi|13651332|ref|XM_016124.1|[13651332] 

XM_018114 - region containing hypothetical protein FLJ23544; ribosomal protein L10; ribosomal protein L10; ribosomal protein L10 (LOC88324), mRNA 
gi|13649125|ref|XM_018114.1|[13649125] 

XM_008294 - ribosomal protein L19 (RPL19), mRNA 
gi|13632268|ref|XM_008294.3|[13632268] 

XM_009693 - ribosomal protein S21 (RPS21), mRNA 
gi|15304527|ref|XM_009693.3|[15304527] 

XM_053478 - ribosomal protein L10a (RPL10A), mRNA 
gi|15303249|ref|XM_053478.1|[15303249] 

XM_015318 - ribosomal protein S26 (RPS26), mRNA 
gi|15303043|ref|XM_015318.2|[15303043] 

XM_027885 - ribosomal protein L13a (RPL13A), mRNA 
gi|15302570|ref|XM_027885.2|[15302570] 

XM_007615 - ribosomal protein S17 (RPS17), mRNA 
gi|15302513|ref|XM_007615.4|[15302513] 

XM_054333 - ribosomal protein L28 (RPL28), mRNA 
gi|15302226|ref|XM_054333.1|[15302226] 

XM_031815 - ribosomal protein S20 (RPS20), mRNA 
gi|15300059|ref|XM_031815.2|[15300059] 

XM_039576 - ribosomal protein S24 (RPS24), mRNA 
gi|15299342|ref|XM_039576.2|[15299342] 

XM_004020 - ribosomal protein S23 (RPS23), mRNA 
gi|15297223|ref|XM_004020.2|[15297223] 

XM_053824 - ribosomal protein L32 (RPL32), mRNA 
gi|15296504|ref|XM_053824.1|[15296504] 

XM_054368 - ribosomal protein L11 (RPL11), mRNA 
gi|15296209|ref|XM_054368.1|[15296209] 

XM_036739 - ribosomal protein S27 (metallopanstimulin 1) (RPS27), mRNA 
gi|15294821|ref|XM_036739.2|[15294821] 

XM_027332 - ribosomal protein L36 (RPL36), mRNA 
gi|14786075|ref|XM_027332.1|[14786075] 

XM_027331 - ribosomal protein L36 (RPL36), mRNA 
gi|14786072|ref|XM_027331.1|[14786072] 

XM_027333 - ribosomal protein L36 (RPL36), mRNA 
gi|14786069|ref|XM_027333.1|[14786069] 

XM_046140 - 60S ribosomal protein L30 isolog (LOC51187), mRNA 
gi|14785520|ref|XM_046140.1|[14785520] 

XM_046136 - 60S ribosomal protein L30 isolog (LOC51187), mRNA 
gi|14785516|ref|XM_046136.1|[14785516] 

XM_043287 - ribosomal protein S10 (RPS10), mRNA 
gi|14782916|ref|XM_043287.1|[14782916] 

XM_043285 - ribosomal protein S10 (RPS10), mRNA 
gi|14782914|ref|XM_043285.1|[14782914] 

XM_049965 - ribosomal protein L18 (RPL18), mRNA 
gi|14760401|ref|XM_049965.1|[14760401] 

XM_049096 - ribosomal protein S26 (RPS26), mRNA 
gi|14759881|ref|XM_049096.1|[14759881] 

XM_015328 - ribosomal protein L41 (RPL41), mRNA 
gi|14759754|ref|XM_015328.2|[14759754] 

XM_008923 - ribosomal protein S11 (RPS11), mRNA 
gi|14757439|ref|XM_008923.4|[14757439] 

XM_027884 - ribosomal protein L13a (RPL13A), mRNA 
gi|14757411|ref|XM_027884.1|[14757411] 

XM_027886 - ribosomal protein L13a (RPL13A), mRNA 
gi|14757404|ref|XM_027886.1|[14757404] 

XM_035924 - ribosomal protein L28 (RPL28), mRNA 
gi|14757079|ref|XM_035924.1|[14757079] 

XM_017626 - ribosomal protein S12 (RPS12), mRNA 
gi|14756487|ref|XM_017626.2|[14756487] 

XM_029926 - ribosomal protein S19 (RPS19), mRNA 
gi|14756213|ref|XM_029926.1|[14756213] 

XM_034265 - ribosomal protein S5 (RPS5), mRNA 
gi|14755544|ref|XM_034265.1|[14755544] 

XM_029544 - 40S ribosomal protein S27 isoform (LOC51065), mRNA 
gi|14752644|ref|XM_029544.1|[14752644] 

XM_035389 - ribosomal protein, large, P1 (RPLP1), mRNA 
gi|14749908|ref|XM_035389.1|[14749908] 

XM_035388 - ribosomal protein, large, P1 (RPLP1), mRNA 
gi|14749900|ref|XM_035388.1|[14749900] 

XM_035387 - ribosomal protein, large, P1 (RPLP1), mRNA 
gi|14749891|ref|XM_035387.1|[14749891] 

XM_035494 - ribosomal protein L7 (RPL7), mRNA 
gi|14749839|ref|XM_035494.1|[14749839] 

XM_035493 - ribosomal protein L7 (RPL7), mRNA 
gi|14749837|ref|XM_035493.1|[14749837] 

XM_035492 - ribosomal protein L7 (RPL7), mRNA 
gi|14749834|ref|XM_035492.1|[14749834] 

XM_052447 - ribosomal protein L29 (RPL29), mRNA 
gi|14747560|ref|XM_052447.1|[14747560] 

XM_052669 - ribosomal protein S29 (RPS29), mRNA 
gi|14747175|ref|XM_052669.1|[14747175] 

XM_044796 - ribosomal protein L35 (RPL35), mRNA 
gi|14744218|ref|XM_044796.1|[14744218] 

XM_039575 - ribosomal protein S24 (RPS24), mRNA 
gi|14743725|ref|XM_039575.1|[14743725] 

XM_039577 - ribosomal protein S24 (RPS24), mRNA 
gi|14743718|ref|XM_039577.1|[14743718] 

XM_039578 - ribosomal protein S24 (RPS24), mRNA 
gi|14743713|ref|XM_039578.1|[14743713] 

XM_046554 - ribosomal protein S8 (RPS8), mRNA 
gi|14742855|ref|XM_046554.1|[14742855] 

XM_034712 - ribosomal protein L34 (RPL34), mRNA 
gi|14734144|ref|XM_034712.1|[14734144] 

XM_034711 - ribosomal protein L34 (RPL34), mRNA 
gi|14734139|ref|XM_034711.1|[14734139] 

XM_042550 - ribosomal protein S14 (RPS14), mRNA 
gi|14734089|ref|XM_042550.1|[14734089] 

XM_042549 - ribosomal protein S14 (RPS14), mRNA 
gi|14734082|ref|XM_042549.1|[14734082] 

XM_042548 - ribosomal protein S14 (RPS14), mRNA 
gi|14734076|ref|XM_042548.1|[14734076] 

XM_015463 - ribosomal protein L24 (RPL24), mRNA 
gi|14733795|ref|XM_015463.2|[14733795] 

XM_040555 - ribosomal protein L24 (RPL24), mRNA 
gi|14733789|ref|XM_040555.1|[14733789] 

XM_036365 - ribosomal protein L31 (RPL31), mRNA 
gi|14728681|ref|XM_036365.1|[14728681] 

XM_017513 - ribosomal protein S27a (RPS27A), mRNA 
gi|14725314|ref|XM_017513.2|[14725314] 

XM_028344 - ribosomal protein L5 (RPL5), mRNA 
gi|14723700|ref|XM_028344.1|[14723700] 

XM_018268 - ribosomal protein L15 (RPL15), mRNA 
gi|14723418|ref|XM_018268.2|[14723418] 

XM_041875 - ribosomal protein L15 (RPL15), mRNA 
gi|14723414|ref|XM_041875.1|[14723414] 

XM_037459 - ribosomal protein S3A (RPS3A), mRNA 
gi|14721867|ref|XM_037459.1|[14721867] 

XM_037458 - ribosomal protein S3A (RPS3A), mRNA 
gi|14721861|ref|XM_037458.1|[14721861] 

XM_037454 - ribosomal protein S3A (RPS3A), mRNA 
gi|14721857|ref|XM_037454.1|[14721857] 

XM_003054 - ribosomal protein L32 (RPL32), mRNA 
gi|13646087|ref|XM_003054.4|[13646087] 

XM_016854 - ribosomal protein S18 (RPS18), mRNA 
gi|13645838|ref|XM_016854.1|[13645838] 

XM_017704 - ribosomal protein L10a (RPL10A), mRNA 
gi|13642762|ref|XM_017704.1|[13642762] 

XM_017770 - ribosomal protein L37 (RPL37), mRNA 
gi|13641596|ref|XM_017770.1|[13641596] 

XM_008905 - ribosomal protein L28 (RPL28), mRNA 
gi|13630273|ref|XM_008905.3|[13630273] 

XM_007281 - ribosomal protein L36a (RPL36A), mRNA 
gi|12738346|ref|XM_007281.2|[12738346] 

XM_002637 - ribosomal protein L37a (RPL37A), mRNA 
gi|11430427|ref|XM_002637.1|[11430427] 

XM_010467 - ribosomal protein S4, Y-linked (RPS4Y), mRNA 
gi|13640136|ref|XM_010467.3|[13640136] 

NM_007209 - ribosomal protein L35 (RPL35), mRNA 
gi|6005859|ref|NM_007209.1|[6005859] 

NM_002952 - ribosomal protein S2 (RPS2), mRNA 
gi|15055538|ref|NM_002952.2|[15055538] 

NM_001031 - ribosomal protein S28 (RPS28), mRNA 
gi|15011938|ref|NM_001031.2|[15011938] 

NM_001030 - ribosomal protein S27 (metallopanstimulin 1) (RPS27), mRNA 
gi|15011937|ref|NM_001030.2|[15011937] 

NM_001029 - ribosomal protein S26 (RPS26), mRNA 
gi|15011935|ref|NM_001029.2|[15011935] 

NM_001026 - ribosomal protein S24 (RPS24), transcript variant 2, mRNA 
gi|14916502|ref|NM_001026.2|[14916502] 

NM_033022 - ribosomal protein S24 (RPS24), transcript variant 1, mRNA 
gi|14916500|ref|NM_033022.1|[14916500] 

NM_001025 - ribosomal protein S23 (RPS23), mRNA 
gi|14790142|ref|NM_001025.2|[14790142] 

NM_016093 - ribosomal protein L26 homolog (LOC51121), mRNA 
gi|7705812|ref|NM_016093.1|[7705812] 

NM_015414 - ribosomal protein L36 (RPL36), mRNA 
gi|7661637|ref|NM_015414.1|[7661637] 

NM_000988 - ribosomal protein L27 (RPL27), mRNA 
gi|4506622|ref|NM_000988.1|[4506622] 

NM_000986 - ribosomal protein L24 (RPL24), mRNA 
gi|4506618|ref|NM_000986.1|[4506618] 

NM_003973 - ribosomal protein L14 (RPL14), mRNA 
gi|4506600|ref|NM_003973.1|[4506600] 

NM_001024 - ribosomal protein S21 (RPS21), mRNA 
gi|14670385|ref|NM_001024.2|[14670385] 

NM_001028 - ribosomal protein S25 (RPS25), mRNA 
gi|14591916|ref|NM_001028.2|[14591916] 






WO 03/038049 



PCT/US02/34645 



NM_001023 - ribosomal protein S20 (RPS20), mRNA
gi|14591915|ref|NM_001023.2|[14591915]

NM_001022 - ribosomal protein S19 (RPS19), mRNA
gi|14591914|ref|NM_001022.2|[14591914]

NM_001021 - ribosomal protein S17 (RPS17), mRNA
gi|14591913|ref|NM_001021.2|[14591913]

NM_001020 - ribosomal protein S16 (RPS16), mRNA
gi|14591912|ref|NM_001020.2|[14591912]

NM_001018 - ribosomal protein S15 (RPS15), mRNA
gi|14591911|ref|NM_001018.2|[14591911]

NM_001017 - ribosomal protein S13 (RPS13), mRNA
gi|14591910|ref|NM_001017.2|[14591910]

NM_000969 - ribosomal protein L5 (RPL5), mRNA
gi|14591908|ref|NM_000969.2|[14591908]

NM_000978 - ribosomal protein L23 (RPL23), mRNA
gi|14591907|ref|NM_000978.2|[14591907]

NM_000985 - ribosomal protein L17 (RPL17), mRNA
gi|14591906|ref|NM_000985.2|[14591906]

NM_012423 - ribosomal protein L13a (RPL13A), mRNA
gi|14591905|ref|NM_012423.2|[14591905]

NM_001016 - ribosomal protein S12 (RPS12), mRNA
gi|14277699|ref|NM_001016.2|[14277699]

NM_001015 - ribosomal protein S11 (RPS11), mRNA
gi|14277698|ref|NM_001015.2|[14277698]

NM_001019 - ribosomal protein S15a (RPS15A), mRNA
gi|14165468|ref|NM_001019.2|[14165468]

NM_022551 - ribosomal protein S18 (RPS18), mRNA
gi|14165467|ref|NM_022551.2|[14165467]

NM_001013 - ribosomal protein S9 (RPS9), mRNA
gi|14141192|ref|NM_001013.2|[14141192]

NM_005617 - ribosomal protein S14 (RPS14), mRNA
gi|14141191|ref|NM_005617.2|[14141191]

NM_000990 - ribosomal protein L27a (RPL27A), mRNA
gi|14141189|ref|NM_000990.2|[14141189]

NM_001009 - ribosomal protein S5 (RPS5), mRNA
gi|13904869|ref|NM_001009.2|[13904869]

NM_001032 - ribosomal protein S29 (RPS29), mRNA
gi|13904868|ref|NM_001032.2|[13904868]

NM_001014 - ribosomal protein S10 (RPS10), mRNA
gi|13904867|ref|NM_001014.2|[13904867]

NM_000991 - ribosomal protein L28 (RPL28), mRNA
gi|13904865|ref|NM_000991.2|[13904865]

NM_000995 - ribosomal protein L34 (RPL34), mRNA
gi|4506636|ref|NM_000995.1|[4506636]

NM_001997 - Finkel-Biskis-Reilly murine sarcoma virus (FBR-MuSV) ubiquitously
expressed (fox derived); ribosomal protein S30 (FAU), mRNA
gi|4503658|ref|NM_001997.1|[4503658]

NM_022061 - ribosomal protein L17 isolog (LOC63875), mRNA
gi|11596858|ref|NM_022061.1|[11596858]

NM_021104 - ribosomal protein L41 (RPL41), mRNA
gi|10863874|ref|NM_021104.1|[10863874]

NM_021029 - ribosomal protein L44 (RPL44), mRNA
gi|10445222|ref|NM_021029.1|[10445222]

NM_016304 - 60S ribosomal protein L30 isolog (LOC51187), mRNA
gi|10047101|ref|NM_016304.1|[10047101]

NM_002295 - laminin receptor 1 (67kD, ribosomal protein SA) (LAMR1), mRNA
gi|9845501|ref|NM_002295.2|[9845501]

NM_016183 - 60S acidic ribosomal protein P0 (LOC51154), mRNA
gi|7705874|ref|NM_016183.1|[7705874]

NM_015971 - 30S ribosomal protein S7 homolog (LOC51081), mRNA
gi|7705737|ref|NM_015971.1|[7705737]

NM_015920 - 40S ribosomal protein S27 isoform (LOC51065), mRNA
gi|7705705|ref|NM_015920.1|[7705705]

NM_005061 - ribosomal protein L3-like (RPL3L), mRNA
gi|4826987|ref|NM_005061.1|[4826987]

NM_001010 - ribosomal protein S6 (RPS6), mRNA
gi|4506730|ref|NM_001010.1|[4506730]

NM_001008 - ribosomal protein S4, Y-linked (RPS4Y), mRNA
gi|4506726|ref|NM_001008.1|[4506726]

NM_001007 - ribosomal protein S4, X-linked (RPS4X), mRNA
gi|4506724|ref|NM_001007.1|[4506724]

NM_001004 - ribosomal protein, large P2 (RPLP2), mRNA
gi|4506670|ref|NM_001004.1|[4506670]

NM_001003 - ribosomal protein, large, P1 (RPLP1), mRNA
gi|4506668|ref|NM_001003.1|[4506668]

NM_001002 - ribosomal protein, large, P0 (RPLP0), mRNA
gi|4506666|ref|NM_001002.1|[4506666]

NM_000972 - ribosomal protein L7a (RPL7A), mRNA
gi|4506660|ref|NM_000972.1|[4506660]

NM_000970 - ribosomal protein L6 (RPL6), mRNA
gi|4506656|ref|NM_000970.1|[4506656]

NM_000968 - ribosomal protein L4 (RPL4), mRNA
gi|4506652|ref|NM_000968.1|[4506652]

NM_001001 - ribosomal protein L36a (RPL36A), mRNA
gi|4506650|ref|NM_001001.1|[4506650]

NM_000967 - ribosomal protein L3 (RPL3), mRNA
gi|4506648|ref|NM_000967.1|[4506648]

NM_001000 - ribosomal protein L39 (RPL39), mRNA
gi|4506646|ref|NM_001000.1|[4506646]

NM_000999 - ribosomal protein L38 (RPL38), mRNA
gi|4506644|ref|NM_000999.1|[4506644]

NM_000998 - ribosomal protein L37a (RPL37A), mRNA
gi|4506642|ref|NM_000998.1|[4506642]

NM_000997 - ribosomal protein L37 (RPL37), mRNA
gi|4506640|ref|NM_000997.1|[4506640]

NM_000996 - ribosomal protein L35a (RPL35A), mRNA
gi|4506638|ref|NM_000996.1|[4506638]

NM_000992 - ribosomal protein L29 (RPL29), mRNA
gi|4506628|ref|NM_000992.1|[4506628]

NM_000987 - ribosomal protein L26 (RPL26), mRNA
gi|4506620|ref|NM_000987.1|[4506620]

NM_000984 - ribosomal protein L23a (RPL23A), mRNA
gi|4506614|ref|NM_000984.1|[4506614]

NM_000983 - ribosomal protein L22 (RPL22), mRNA
gi|4506612|ref|NM_000983.1|[4506612]

NM_000982 - ribosomal protein L21 (gene or pseudogene) (RPL21), mRNA
gi|4506610|ref|NM_000982.1|[4506610]

NM_000981 - ribosomal protein L19 (RPL19), mRNA
gi|4506608|ref|NM_000981.1|[4506608]



All of the sequences in Tables 2 and 3 are incorporated by reference in their entirety. 
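
Each sequence listed above is identified by a line of the form gi|GI number|ref|accession.version|[GI number]. As an illustration only, the following minimal Python sketch shows one way such an identifier line might be split into its parts. The function name parse_gi_line and the regular expression are assumptions made for this example and are not part of the disclosure.

```python
import re

# Assumed layout of an identifier line: gi|<GI>|ref|<accession>.<version>|[<GI>]
GI_LINE = re.compile(r"^gi\|(\d+)\|ref\|([A-Z]{2}_\d+)\.(\d+)\|\[(\d+)\]$")


def parse_gi_line(line):
    """Split one identifier line into GI number, accession, and version."""
    match = GI_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized identifier line: {line!r}")
    gi, accession, version, gi_repeat = match.groups()
    # The GI number appears twice on each line; the two copies should agree.
    if gi != gi_repeat:
        raise ValueError("bracketed GI number does not match the leading GI number")
    return {"gi": int(gi), "accession": accession, "version": int(version)}


# Example using the RPS6 entry from the list above.
record = parse_gi_line("gi|4506730|ref|NM_001010.1|[4506730]")
print(record)
```

A check of this kind can help catch transcription errors, since the GI number is repeated on every identifier line.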

In preferred embodiments, the tagged ribosomal proteins are S6 or L37 ribosomal
proteins, more preferably tagged with a Strep Tag peptide tag, most preferably with the
peptide tag at the C-terminus. In another preferred embodiment, the mRNA binding protein
is not polyA binding protein.

5.3. ISOLATION OF RIBOSOMES 

Various methods exist to isolate ribosomes, particularly polysomes, from cultured
cells and tissues from transformed organisms (see, e.g., Bommer et al., 1997, Isolation and
characterization of eukaryotic polysomes, in Subcellular Fractionation, Graham and
Rickwood (eds.), IRL Press, Oxford, pp. 280-285; incorporated herein by reference in its
entirety). Preferably, the isolation method employed has the following characteristics:

(1) Translation arresting compounds, such as emetine or cycloheximide, are added to
arrest translation, if possible, as a pre-treatment even before homogenization. This
prevents ribosome run-off and keeps the ribosome-mRNA complex stable, i.e., the
ribosome remains bound to the mRNA.

(2) RNase inhibitors such as SUPERase*In™ RNase Inhibitor (Ambion, Austin, Texas)
are added to buffers to maintain the integrity of the mRNA.

(3) After tissue or cell homogenization, total polysomes are isolated by preparing a
post-mitochondrial supernatant in the presence of at least a high concentration salt
buffer, e.g., 100-150 mM KCl.

(4) Detergent is also added to the post-mitochondrial supernatant to release
membrane-associated polysomes from endoplasmic reticulum membranes; total
polysomes are usually collected by centrifugation through a sucrose cushion.

In certain embodiments, a variation of the above-described general method is used to
isolate membrane-associated polysomes from a total pool of polysomes. This allows one to
focus on the mRNA species encoding secreted or transmembrane proteins, which are often
targets of choice for drug discovery. Various methods may be used to isolate
membrane-associated polysomes from cultured cells and tissue, e.g., methods that employ
differential centrifugation (Hall C, Lim L. Developmental changes in the composition of
polyadenylated RNA isolated from free and membrane-bound polyribosomes of the rat
forebrain, analysed by translation in vitro. Biochem J. 1981 Apr 15;196(1):327-36),
rate-zonal centrifugation (Rademacher and Steele, 1986, Isolation of undegraded free and
membrane-bound polysomal mRNA from rat brain, J. Neurochem. 47(3):953-957),
isopycnic centrifugation (Mechler, 1987, Isolation of messenger RNA from membrane-
bound polysomes, Methods Enzymol. 152: 241-248), and differential extraction (Bommer et
al., 1997, Isolation and characterization of eukaryotic polysomes, in Subcellular
Fractionation, Graham and Rickwood (eds.), IRL Press, Oxford, pp. 280-285; incorporated
herein by reference in its entirety) to isolate the membrane-associated polysomes.

Other appropriate cell lysates or fractions may be obtained using routine
biochemical methods.

Specific polysomes can also be isolated using affinity separation techniques
targeting nascent polypeptides or endogenous or tagged mRNA-binding proteins using
art-known methods, e.g., using the methods of Lynch, 1987, Meth. Enzymol. 152: 248-253,
and Brooks and Rigby, 2000, Nucleic Acids Res. 28(10): e49.

In certain embodiments, polysomes are not isolated from the post-mitochondrial
supernatant or even from a cell or tissue lysate before being subject to affinity purification.

Once the cell lysate or fraction is obtained, the tagged ribosomes may be isolated
using routine methods from untagged ribosomes and other cell components, preferably
isolated from RNA, most preferably isolated from mRNA, that is not bound to molecularly
tagged ribosomes or tagged mRNA binding protein, using affinity reagents that bind the tag
specifically.

In a preferred embodiment, the ribosomes are isolated from transfected cells by
scraping them into homogenization buffer (50 mM sucrose, 200 mM ammonium chloride, 7
mM magnesium acetate, 1 mM dithiothreitol, and 20 mM Tris-HCl, pH 7.6). The cells are
then lysed by the addition of the detergent NP-40 (Nonidet P40,
CALBIOCHEM-NOVABIOCHEM Corporation, San Diego, California) to a concentration
of 0.5%, followed by five strokes in a glass Dounce tissue homogenizer. Unlysed cells,
nuclei and mitochondria are pelleted by centrifugation at 10,000 x g for 10 minutes at 4°C.
The supernatant is removed and layered over a two-step discontinuous gradient of 1.8 M
and 1.0 M sucrose in 100 mM ammonium chloride, 5 mM magnesium acetate, 1 mM
dithiothreitol, 20 mM Tris-HCl (pH 7.6). The gradient is centrifuged for 18 hours at
98,000 x g at 4°C.

Following centrifugation, the supernatants are removed, and the polysome pellet is
resuspended in 100 mM ammonium chloride, 5 mM magnesium chloride, 1 mM DTT and
20 mM Tris-HCl (pH 7.6).
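
The buffers above are specified only by their final concentrations. As a minimal sketch, the following Python example computes the volume of each stock solution needed for a given batch of the homogenization buffer, using the dilution relation C1 x V1 = C2 x V2. The stock concentrations and the 100 mL batch size are hypothetical values chosen for illustration; they are not taken from the protocol.

```python
# Final concentrations (mM) of the homogenization buffer given in the text.
HOMOGENIZATION_BUFFER_MM = {
    "sucrose": 50.0,
    "ammonium chloride": 200.0,
    "magnesium acetate": 7.0,
    "dithiothreitol": 1.0,
    "Tris-HCl pH 7.6": 20.0,
}

# Hypothetical stock concentrations (mM); assumptions, not from the text.
ASSUMED_STOCKS_MM = {
    "sucrose": 2000.0,
    "ammonium chloride": 4000.0,
    "magnesium acetate": 1000.0,
    "dithiothreitol": 1000.0,
    "Tris-HCl pH 7.6": 1000.0,
}


def stock_volumes_ml(final_mm, stock_mm, total_volume_ml):
    """Return the volume (mL) of each stock, plus water, for the requested batch."""
    volumes = {}
    for component, final in final_mm.items():
        stock = stock_mm[component]
        if final > stock:
            raise ValueError(f"stock for {component} is too dilute")
        # C1 * V1 = C2 * V2, solved for the stock volume V1.
        volumes[component] = total_volume_ml * final / stock
    # Water makes up the remainder of the batch.
    volumes["water"] = total_volume_ml - sum(volumes.values())
    return volumes


recipe = stock_volumes_ml(HOMOGENIZATION_BUFFER_MM, ASSUMED_STOCKS_MM, 100.0)
for component, volume in recipe.items():
    print(f"{component}: {volume:.2f} mL")
```

The same function could be applied to the gradient and resuspension buffers by supplying their final concentrations in place of HOMOGENIZATION_BUFFER_MM.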

An equal volume of 2X denaturing protein electrophoresis sample buffer is added to
the polysome sample. Solubilized polysomal proteins are fractionated by electrophoresis
through an SDS-containing 4-20% gradient polyacrylamide gel, and transferred to a
nitrocellulose filter.
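
The centrifugation steps of the preferred embodiment above are stated as relative centrifugal force (10,000 x g and 98,000 x g), whereas centrifuges are commonly set in revolutions per minute. The following Python sketch applies the standard relation RCF = 1.118 x 10^-5 x r x rpm^2, with r the rotor radius in centimeters. The 8.0 cm radius is a hypothetical value used only for illustration; the actual radius depends on the rotor used.

```python
import math

# Standard conversion constant: RCF = 1.118e-5 * radius_cm * rpm**2
RCF_CONSTANT = 1.118e-5


def rpm_from_rcf(rcf, radius_cm):
    """Rotor speed (rpm) needed to reach a given relative centrifugal force."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) produced at a given rotor speed."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


# Hypothetical rotor radius in cm; an assumption, not taken from the text.
ASSUMED_RADIUS_CM = 8.0

for target_rcf in (10_000, 98_000):
    speed = rpm_from_rcf(target_rcf, ASSUMED_RADIUS_CM)
    print(f"{target_rcf} x g at r = {ASSUMED_RADIUS_CM} cm: about {speed:.0f} rpm")
```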

The isolation of tagged polysomes directly from crude or post-mitochondrial
supernatants (adjusted appropriately with NaCl and detergent) is also envisioned. In certain
embodiments, molecular tagging is achieved through the introduction of amino acids into a
ribosomal protein-encoding gene such that the amino acids form a polypeptide region (i.e., a
tag) that is capable of acting as a receptor or ligand for an affinity separation.

Because nascent polypeptides are attached to isolated monosomes and polysomes,
the methods of the invention can also be used to isolate newly synthesized polypeptides
from a cell type of interest (e.g., for proteomic applications).
Tagged polysomes that contain specific mRNAs (see infra) are isolated using
antibodies that recognize specific nascent, encoded polypeptide chains (for review see
Lynch DC. Use of antibodies to obtain specific polysomes. Methods Enzymol.
1987;152:248-53; Schutz G, Kieval S, Groner B, Sippel AE, Kurtz D, Feigelson P. Isolation
of specific messenger RNA by adsorption of polysomes to matrix-bound antibody. Nucleic
Acids Res. 1977 Jan;4(1):71-84; and Shapiro SZ, Young JR. An immunochemical method
for mRNA purification. Application to messenger RNA encoding trypanosome variable
surface antigen. J Biol Chem. 1981 Feb 25;256(4):1495-8). Particular mRNA species as low
in abundance as 0.01-0.05% of total mRNA have been purified to near homogeneity via this
approach.

Affinity methods that can be used to isolate or purify tagged ribosomes or other
mRNA binding proteins, taking advantage of the affinity of a reagent for the peptide tag, are
well known in the art, including chromatography, solid phase chromatography,
precipitation, matrices, etc.

In specific embodiments, the invention provides molecularly tagged ribosomes,
preferably bound to mRNA, that are bound to an affinity reagent for the molecular tag. In
more specific embodiments, the molecularly tagged ribosomes are bound to an affinity
reagent that is bound, preferably covalently, to a solid surface, such as a chromatography
resin, e.g., agarose, sepharose, and the like.

5.4. ISOLATION OF mRNA FROM PURIFIED POLYSOMES 

Once the tagged ribosome or mRNA binding protein has been isolated, the
associated mRNA complexed with the ribosome or mRNA binding protein may be isolated
using methods well known in the art. For example, elution of mRNA is accomplished by
addition of EDTA to buffers, which disrupts polysomes and allows isolation of bound
mRNA for analysis (Schutz et al. (1977), Nucl. Acids Res. 4:71-84; Kraus and Rosenberg
(1982), Proc. Natl. Acad. Sci. USA 79:4015-4019). In addition, isolated polysomes
(attached to or detached from the isolation matrix) can be directly input into RNA isolation
procedures using reagents such as Tri-reagent (Sigma) or Triazol (Sigma). In particular
embodiments, poly A+ mRNA is preferentially isolated by virtue of its hybridization to
oligo dT cellulose. Methods of mRNA isolation are described, for example, in Sambrook et
al., 2001, Molecular Cloning, A Laboratory Manual, Third Edition, Cold Spring Harbor
Laboratory Press, N.Y.; and Ausubel et al., 1989, Current Protocols in Molecular Biology,
Greene Publishing Associates and Wiley Interscience, N.Y., both of which are hereby
incorporated by reference in their entireties.
5.5. REGULATORY SEQUENCES FOR EXPRESSION OF TAGGED RIBOSOMES

According to the methods of the invention, the tagged ribosomes are selectively
expressed in a particular chosen cell type. Such expression is achieved by driving the
expression of the tagged ribosomal protein or mRNA binding protein using regulatory
sequences from a gene expressed in the chosen cell type.

The population of cells comprises a discernable group of cells sharing a common
characteristic. Because of its selective expression, the population of cells may be
characterized or recognized based on its positive expression of the characterizing gene.
According to the methods of the invention, some or all of the regulatory sequences may be
incorporated into nucleic acids of the invention (including transgenes) to regulate the
expression of tagged ribosomal protein or mRNA binding protein coding sequences. In
certain embodiments, a gene that is not constitutively expressed (i.e., exhibits some spatial
or temporal restriction in its expression pattern) is used as a source of a regulatory sequence.
In other embodiments, a gene that is constitutively expressed is used as a source of a
regulatory sequence, for example, when the nucleic acids of the invention are expressed in
cultured cells.

In certain embodiments, the expression of tagged ribosomal protein or mRNA
binding protein coding sequences is regulated by a non-ribosomal regulatory sequence.
Such a sequence may include, but not be limited to, parts of a ribosomal regulatory
sequence (but does not include the entire ribosomal regulatory sequence), but such sequence
effects a different expression pattern than the ribosomal regulatory sequence.

Preferably, the regulatory sequence is derived from a human or mouse gene
associated with an adrenergic or noradrenergic neurotransmitter pathway, e.g., one of the
genes listed in Table 4; a cholinergic neurotransmitter pathway, e.g., one of the genes listed
in Table 5; a dopaminergic neurotransmitter pathway, e.g., one of the genes listed in Table
6; a GABAergic neurotransmitter pathway, e.g., one of the genes listed in Table 7; a
glutamatergic neurotransmitter pathway, e.g., one of the genes listed in Table 8; a
glycinergic neurotransmitter pathway, e.g., one of the genes listed in Table 9; a
histaminergic neurotransmitter pathway, e.g., one of the genes listed in Table 10; a
neuropeptidergic neurotransmitter pathway, e.g., one of the genes listed in Table 11; a
serotonergic neurotransmitter pathway, e.g., one of the genes listed in Table 12; a nucleotide
receptor, e.g., one of the genes listed in Table 13; an ion channel, e.g., one of the genes
listed in Table 14; markers of undifferentiated or not fully differentiated cells, preferably
nerve cells, e.g., one of the genes listed in Table 15; the sonic hedgehog signaling pathway,
e.g., one of the genes in Table 16; calcium binding, e.g., one of the genes listed in Table 17;
or a neurotrophic factor receptor, e.g., one of the genes listed in Table 18.

The ion channel encoded by or associated with the gene selected as the source of the
regulatory sequence is preferably involved in generating and modulating ion flux across the
plasma membrane of neurons, including, but not limited to, voltage-sensitive and/or
cation-sensitive channels, e.g., a calcium, sodium or potassium channel.

In Tables 4-18 that follow, the common names of genes are listed, as well as their
GeneCards identifiers (Rebhan et al., 1997, GeneCards: encyclopedia for genes, proteins
and diseases, Weizmann Institute of Science, Bioinformatics Unit and Genome Center
(Rehovot, Israel)). GenBank accession numbers, UniGene accession numbers, and Mouse
Genome Informatics (MGI) Database accession numbers, where available, are also listed.
GenBank is the NIH genetic sequence database, an annotated collection of all publicly
available DNA sequences (Benson et al., 2000, Nucleic Acids Res. 28(1): 15-18). The
GenBank accession number is a unique identifier for a sequence record. An accession
number applies to the complete record and is usually a combination of a letter(s) and
numbers, such as a single letter followed by five digits (e.g., U12345), or two letters
followed by six digits (e.g., AF123456).
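
The two accession number formats just described (one letter followed by five digits, or two letters followed by six digits) can be checked mechanically. The following minimal Python sketch illustrates such a check; the function name and the regular expression are assumptions made for this example. It deliberately covers only the two formats named above, so identifiers written in other styles, such as the NM_ and XM_ entries that also appear in the tables, are not matched.

```python
import re

# One letter + five digits (e.g., U12345) or two letters + six digits (e.g., AF123456).
GENBANK_ACCESSION = re.compile(r"^(?:[A-Z]\d{5}|[A-Z]{2}\d{6})$")


def looks_like_genbank_accession(text):
    """Return True if text matches one of the two formats described above."""
    return bool(GENBANK_ACCESSION.match(text))


for candidate in ("U12345", "AF123456", "U 12345", "NM_000025"):
    print(candidate, looks_like_genbank_accession(candidate))
```

A check of this kind flags entries that contain stray spaces or substituted characters, which are common transcription errors in long accession lists.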

Accession numbers do not change, even if information in the record is changed at
the author's request. An original accession number might become secondary to a newer
accession number, if the authors make a new submission that combines previous sequences,
or if for some reason a new submission supersedes an earlier record.

UniGene (Schuler et al., 1996, A gene map of the human genome, Science
274(5287): 540-6) is an experimental system for automatically partitioning GenBank
sequences into a non-redundant set of gene-oriented clusters for cow, human, mouse, rat,
and zebrafish. Within UniGene, expressed sequence tags (ESTs) and full-length mRNA
sequences are organized into clusters that each represent a unique known or putative gene.
Each UniGene cluster contains related information such as the tissue types in which the
gene has been expressed and map location. Sequences are annotated with mapping and
expression information and cross-referenced to other resources. Consequently, the
collection may be used as a resource for gene discovery.

The Mouse Genome Informatics (MGI) Database (Jackson Laboratory, Bar Harbor, 
Maine) contains information on mouse genetic markers, mRNA and genomic sequence 
information, phenotypes, comparative mapping data, experimental mapping data, and 
graphical displays for genetic, physical, and cytogenetic maps. 

TABLE 4

Gene | GenBank and/or UniGene Accession Number | MGI Database Accession Number
ADRB1 (adrenergic beta 1) | human: J03019 | MGI:87937
ADRB2 (adrenergic beta 2) | human: M15169 | MGI:87938
ADRB3 (adrenergic beta 3) | human: NM_000025, X70811, X72861, M29932, X70812, S53291, X70812 | MGI:87939
ADRA1A (adrenergic alpha 1a) | human: D25235, U02569, AF013261, L31774, U03866; guinea pig: AF108016 |
ADRA1B (adrenergic alpha 1b) | human: U03865, L31773 | MGI:104774
ADRA1C (adrenergic alpha 1c) | human: U08994; mouse: NM_013461 |
ADRA1D (adrenergic alpha 1d) | human: M76446, U03864, L31772, D29952, S70782 | MGI:106673
ADRA2A (adrenergic alpha 2A) | human: M18415, M23533 | MGI:87934
ADRA2B (adrenergic alpha 2B) | human: M34041, AF005900 | MGI:87935
ADRA2C (adrenergic alpha 2C) | human: J03853, D13538, U72648 | MGI:87936
SLC6A2; Norepinephrine transporter (NET) | human: X91117, M65105, AB022846, AF061198 | MGI:1270850





TABLE 5

Gene | GenBank and/or UniGene Accession Number | MGI Database Accession Number
CHRM1 (Muscarinic Ach M1) receptor | human: X15263, M35128, Y00508, X52068 | MGI:88396
CHRM2 (Muscarinic Ach M2) receptor | human: M16404, AB041391, X15264; mouse: AF264049 |
CHRM3 (Muscarinic Ach M3) receptor | human: U29589, AB041395, X15266; mouse: AF264050 |
CHRM4 (Muscarinic Ach M4) receptor | human: X15265, M16405 | MGI:88399
CHRM5 (Muscarinic Ach M5) receptor | human: AF026263, M80333; rat: NM_017362; mouse: AI327507 |
CHRNA1 (nicotinic alpha1) receptor | human: Y00762, X02502, S77094 | MGI:87885
CHRNA2 (nicotinic alpha2) receptor | human: U62431, Y16281 | MGI:87886
CHRNA3 (nicotinic alpha3) receptor | human: NM_000743, U62432, M37981, M86383, Y08418 |
CHRNA4 (nicotinic alpha4) receptor | human: U62433, L35901, Y08421, X89745, X87629 | MGI:87888
CHRNA5 (nicotinic alpha5) receptor | human: U62434, Y08419, M83712 | MGI:87889
CHRNA7 (nicotinic alpha7) receptor | human: X70297, Y08420, Z23141, U40583, U62436, L25827, AF036903 | MGI:99779
CHRNB1 (nicotinic Beta 1) receptor | human: X14830 | MGI:87890
CHRNB2 (nicotinic Beta 2) receptor | human: U62437, X53179, Y08415, AJ001935 | MGI:87891
CHRNB3 (nicotinic Beta 3) receptor | human: Y08417, X67513, U62438, RIKEN BB284174 |
CHRNB4 (nicotinic Beta 4) receptor | human: U48861, U62439, Y08416, X68275 | MGI:87892
CHRNG nicotinic gamma [illegible] receptor | human: X01715, M11811 | MGI:87895
CHRNE nicotinic epsilon receptor | human: X66403; mouse: NM_009603 |
CHRND nicotinic delta receptor | human: X55019 | MGI:87893


TABLE 6

Gene | GenBank and/or UniGene Accession Number | MGI Database Accession Number
th (tyrosine hydroxylase) | human: M17589 | MGI:98735
dat (dopamine transporter) | human: NM_001044 | MGI:94862
dopamine receptor 1 | human UniGene: X58987, S58541, X55760, X55758 | MGI:99578
dopamine receptor 2 | human UniGene: X51362, M29066, AF050737, S62137, X51645, M30625, S69899 | MGI:94924
dopamine receptor 3 | human UniGene: U25441, U32499 | MGI:94925
dopamine receptor 4 | human UniGene: L12398, S76942 | MGI:94926
dopamine receptor 5 | human UniGene: M67439, M67439, X58454 | MGI:94927
dbh; dopamine beta hydroxylase | human UniGene: X13255 | MGI:94864


TABLE 7

Gene | GenBank and/or UniGene Accession Number | MGI Database Accession Number
GABA A A2; GABRA2; GABA receptor A2 | human: S62907 | MGI:95614
GABA A A3; GABRA3; GABA receptor A3 | human: S62908 | MGI:95615
GABA A A4; GABRB4; GABA receptor A4 | human: NM_000809, U30461 | MGI:95616
GABA A A5; GABRB5; GABA receptor A5 | human: NM_000810, L08485, AF061785, AF061785, AF061785 |
GABA A A6; GABRB6; GABA receptor A6 | human: S81944, AF053072 | MGI:95618
GABA B1; GABRB1; GABA receptor B1 | human: X14767, M59216 | MGI:95619
GABA B2; GABRB2; GABA receptor B2 | human: S67368, S77554, S77553; mouse: MM4707 |
GABA B3; GABRB3; GABA receptor B3 | human: M82919 | MGI:95621
GABRG1; GABA-A receptor, gamma 1 subunit | | MGI:103156
GABRG2; GABA-A receptor, gamma 2 subunit | human: X15376 | MGI:95623
GABRG3; GABA-A receptor, gamma 3 subunit | human: S82769 |
GABRD; GABA-A receptor, delta subunit | human: AF016917 | MGI:95622
GABRE; GABA-A receptor, epsilon subunit | human: U66661, Y07637, Y09765, U92283, Y09763, U92285; mouse: NM_017369 |
GABA A pi; GABRP; GABA-A receptor, pi subunit | human: U95367, AF009702 |
GABA A theta; GABA receptor theta | mouse: NM_020488 |
GABA R1a; GABA receptor rho 1; GABRR1; GABA receptor rho 1 | human: M62400 | MGI:95625
GABA R2; GABA receptor a rho 2; GABRR2; GABA receptor rho 2 | human: M86868 | MGI:95626














TABLE 8

Gene | GenBank and/or UniGene Accession Number | MGI Database Accession Number
GRIA1; GluR1 | human: NM_000827, M64752, X58633, M81886; mouse: NM_008165 |
GRIA2; GluR2 | human: L20814; rat: M85035; mouse: AF250875 |
GRIA3; GluR3 | human: U10301, X82068, U10302; rat: M85036 |
GRIA4; GluR4 | human: U16129; rat: NM_017263 |
GRIK1; glutamate ionotropic kainate 1; gluR5 | human: L19058, U16125, AF107257, AF107259 | MGI:95814
GRIK2; gluR6 | human: U16126; mouse: NM_010349, RIKEN BB359097 |
GRIK3; gluR7 | human: U16127; mouse: AF245444 |
GRIK4; KA1 | human: S67803 | MGI:95817
GRIK5; KA2 | human: S40369 | MGI:95818
GRIN1; NR1 nmdar1; NMDA receptor 1 | human: D13515, L05666, L13268, L13266, AF015731, AF015730, U08106, L13267 | MGI:95819
GRIN2A; NR2A; NMDA receptor 2A | human: NM_000833, U09002, U90277; mouse: NM_008170 |
GRIN2B; NR2B; NMDA receptor 2B | human: NM_000834, U11287, U90278, U88963 | MGI:95821
GRIN2C; NR2C; NMDA receptor 2C | human: U77782, L76224 | MGI:95822
GRIN2D; NR2D; NMDA receptor 2D | human: U77783 | MGI:95823
GRM1; mGluR1a and 1b alternate splicing type I; mGluR1a | human: NM_000838, L76627, AL035698, U31215, AL035698, U31216, L76631; mouse: BB275384, BB181459, BB177876 |
GRM2; mGluR2 type II; mGluR2 | human: L35318; sheep: AF229842 |
GRM3; mGluR3 type II; mGluR3 | human: X77748; mouse: AH008375, MM45836 |
GRM4; mGluR4 type III; mGluR4 | human: X80818 |
GRM5; mGluR5a and 5b alt splice 32 residues; mGluR5 | human: D28538, D28539; mouse: AF140349 |
GRM6; mGluR6 type III; mGluR6 | human: NM_000843, U82083, AJ245872, AJ245871; rat: AJ245718 |
GRM7; mGluR7 type III; mGluR7 | human: NM_000844, X94552; mouse: RIKEN BB357072 |
GRM8; mGluR8 type III; mGluR8 | human: NM_000845, U95025, AJ236921, AJ236922, AC000099; mouse: U17252 |
GRID2; glut ionotropic delta | human: AF009014 | MGI:95813
excitatory amino acid transporter 2; glutamate/aspartate transporter II; glutamate transporter GLT1; glutamate transporter SLC1A2; glial high affinity glutamate transporter | human: U03505, U01824, Z32517, D85884 | MGI:101931
EAAC1; neural SLC1A1; neuronal/epithelial high affinity glutamate transporter | human: U08989, U03506, U06469 | MGI:105083
EEAT1; SLC1A3; glial high affinity glutamate transporter | human: D26443, AF070609, L19158, U03504, Z31713 | MGI:99917
EAAT4; neural SLC1A6; high affinity aspartate/glutamate transporter | human: U18244, AC004659 | MGI:1096331







TABLE 9

Gene | GenBank and/or UniGene Accession Number | MGI Database Accession Number
Glycine receptors alpha 1; GLRA1 | human: X52009 | MGI:95747
Glycine receptors alpha 2; GLRA2 | human: X52008, AF053495 | MGI:95748
Glycine receptors alpha 3; GLRA3 | human: AF017724, U93917, AF018157; mouse: AF214575 |
Glycine receptors alpha 4; GLRA4 | no human; mouse: X75850, X75851, X75852, X75853 |
glycine receptor beta; GLRB | human: U33267, AF094754, AF094755 | MGI:95751





TABLE 10

Gene | GenBank and/or UniGene Accession Number | MGI Database Accession Number
Histamine H1-receptor 1 | human: Z34897, D28481, X76786, AB041380, D14436, AF026261 | MGI:107619
Histamine H2-receptor 2 | human: M64799, AB023486, AB041384 | MGI:108482
Histamine H3-receptor 3 | human: NM_007232; mouse: MM31751 |





TABLE 11 



15 


Gene 


GenBank and /or UniGene 
Accession Number 


MGI Database 
Accession 
Number 




orexm OX-A 


human: AF041240 


MGL1202306 




hypocretin 1 








Orexin B 






20 


Orexin receptor OX1R 
HCRTR1 


human: AF041243 






Orexin receptor OX2R 


human: AF041245 






HCRTR2 








leptinR-long 


human: U66497, U43168, U59263, 


MGI: 104993 


25 


Leptin receptor long form 


U66495, U52913, U66496, 






U52914, U52912, U50748, 
AK001042 [ 






MCH 


human: M57703, S63697 






melanin concentrating hormone 






30 


PMCH 1 






MC3R 

MC3 receptor 
melanocortin 3 receptor 


human: GDB: 138780 
mouse: MM57183 


MGL96929 




MC4R 


human: S77415, L08603, 




35 


MC4 receptor 


NM_005912 




melanocortin 4 receptor 










WO 03/038049 



PCT/US02/34645 









MC5R 


human: L27080, Z25470, U08353 


MGI:99420 




MC5 receptor 








melanocortin 5 receptor 








prepro-CRF 


human: V00571 






corticotropin-releasing factor 


rat: X03036, M54987 






precursor 








CRH 

corticotropin releasing hormone 








CRHR1 


human: L23332, X72304, L23333, 


MGI:88498




CRH/CRF receptor 1 


AF039523, U16273 






CRFR2 


human: U34587, AF019381,


MGI:894312




CRH/CRF receptor 2 


AF011406, AC004976, AC004976 




CRHBP 

CRF binding protein 


human: X58022, S60697 


MGI:88497




Urocortin 


human: AF038633 


MGI: 1276123 




POMC 


human: V01510, M38297, J00292, 


MGI:97742 




Proopiomelanocortin 


M28636 




CART 

cocaine and amphetamine 
regulated transcript 


human: U20325, U16826 


MGI:1351330




NPY 


human: K01911, M15789, 


MGI:97374 




Neuropeptide Y 


M14298, AC004485




prepro NPY 








NPY1R 


human: M88461, M84755, 


MGI: 104963 




NPY Y1 receptor



NM_000909 






Neuropeptide Y1 receptor








NPY2R 


human: U42766, U50146, U32500, 


MGI:108418




NPY Y2 receptor 
Neuropeptide Y2 receptor 


U36269, U42389, U76254, 
NM_000910






NPY Y4 receptor 


human: Z66526, U35232, U42387 


MGI: 105374 




Npy4R Neuropeptide Y4 receptor 








(mouse) 


NPY Y5 receptor 

Npy5R Neuropeptide Y5 receptor 

(mouse) 


human: U94320, U56079, U66275 
mouse: MM10685 


MGI: 108082 


NPY Y6 receptor 

Npy6r Neuropeptide Y receptor 

(mouse) 


human: D86519, U59431, U67780 


MGI: 1098590 


CCK 

cholecystokinin 


human: NM_000729, L00354 


MGI:88297


CCKa receptor 

CCKAR cholecystokinin receptor 


human: L19315, D85606, L13605,
U23430


MGI:99478


CCKb receptor 

CCKBR cholecystokinin receptor 


human: D13305, L04473, L08112, 
L07746,L10822,D21219, 
S70057, AF074029 


MGI:99479


AGRP 

agouti related peptide 


human: NM_001138, U88063,
U89485 


MGI:892013


Galanin 


human: M77140, L11144


MGI:95637


GALP 

Galanin like peptide 
See, Jureus et al. , 2000, 
Endocrinology 141(7):2703-06. 






GalR1 receptor
GALNR1 
galanin receptor 1 


human: NM_001480, U53511,
L34339, U23854 


MGI: 1096364 


GalR2 receptor 
GALNR2 
galanin receptor2 


human: AF040630, AF080586, 
AF042782 


MGI:1337018


GalR3 receptor 

GALNR3 

Galr3 

galanin receptor3 


human: AF073799, Z97630, 
AF067733 


MGI: 1329003 


UTS2 

prepro-urotensin II


human: Z98884, AF104118 


MGI: 1346329 







GPR14 


human: AI263529 






Urotensin receptor 


mouse: AI385474






SST 

somatostatin 


human: J00306 


MGI:98326




SSTR1 


human: M81829 


MGI:98327 




somatostatin receptor sst1








SSTR2 

somatostatin receptor sst2 


human: AF184174, M81830,
AF184174 


MGI:98328




SSTR3 


human: M96738, Z82188 


MGI:98329 




somatostatin receptor sst3 








SSTR4 


human: L14856, L07833, D16826, 


MGI: 105372 




somatostatin receptor sst4 


AL049651 




SSTR5 somatostatin receptor sst5


human: D16827, L14865, 
AL031713 


MGI:894282




GPR7 


human: U22491 


MGI:891989




G protein-coupled receptor 7 








opioid-somatostatin-like receptor 






GPR8 

G protein-coupled receptor 8 
opioid-somatostatin-like receptor 


human: U22492 






PENK (pre Pro Enkephalin) 


human: V00510, J00123 


MGI: 104629 




PDYN (Pre pro Dynorphin) 


human: K02268, AL034562, 


MGI:97535






X00176 






OPRM1 


human: L25119, L29301, U12569, 


MGI:97441 




μ opiate receptor


AL132774 






OPRK1 


human: U11053, L37362, U17298 


MGI:97439




κ opiate receptor








OPRD1 

delta opiate receptor 


human: U07882, U10504, 
AL009181 


MGI:97438




OPRL1 


human: X77130, U30185 


MGI:97440




ORL1 opioid receptor-like 








receptor 


VR1 

Vanilloid receptor subtype 1 


human: NM_018727, BE466577
mouse: BE623398, 






VRL-1 

vanilloid receptor-like protein 1
VR1L1 

vanilloid receptor type 1 like 
protein 1 VRL1 

vanilloid receptor-like protein 1 


human: NM_015930
rat: AB040873 
mouse: NM_011706


MGI:1341836 




VR-OAC 

vanilloid receptor-related 
osmotically activated channel 


human: AC007834 




CNR1 

cannabinoid receptors CB1


human: U73304, X81120, X81120, 
X54937, X81121 


MGI:104615




EDN1 

endothelin 1 ET-1 


human: J05008, Y00749, S56805, 
Z98050, M25380 


MGI:95283




GHRH 

growth hormone releasing 
hormone 


human: L00137, AL031659, 
L00137


MGI:95709




GHRHR 

growth hormone releasing 
hormone receptor


human: AF029342, U34195, 
mouse: NM_010285






PNOC 

nociceptin orphanin FQ/nocistatin 


human: X97370, U48263, X97367 


MGI: 105308 




NPFF 

neuropeptide FF precursor 


human: AF005271 
mouse: RIKEN BB365815 






neuropeptide FF receptor 
neuropeptide AF receptor 
G-protein coupled receptor 
HLWAR77 

G-protein coupled receptor 
NPGPR 


human: AF257210, NM_004885, 
AF119815 




GRP 


human: K02054, S67384, S73265, 


MGI:95833




gastrin releasing peptide 


M12512 






preprogastrin-releasing peptide 








GRPR 


human: M73481, U57365 


MGI:95836




gastrin releasing peptide receptor 








BB2 








NMB 

neuromedin B 


human: M21551 
mouse: AI327379 






NMBR 


human: M73482 


MGI: 1100525 




neuromedin B receptor BB1 








BRS3 


human: Z97632, L08893, X76498 






bombesin like receptor subtype-3 


mouse: AB010280




uterine bombesin receptor 








GCG PROglucagon 


human: J04040, X03991, V01515 


MGI:95674




GLP-1 








GLP-2 








GCGR 


human: U03469, L20316 


MGI:99572


glucagon receptor 








GLP1R 


human: AL035690, U01104,


MGI:99571




GLP1 receptor 


U01157,L23503,U01156, 
U10037 






GLP2R 


human: AF105367




GLP2 receptor 


mouse: AF166265






VIP 


human: M36634, M54930, 


MGI:98933




vasoactive intestinal peptide 


M14623, M33027, M11554,
L00158, M36612 






SCT 


mouse: NM_011328, X73580






secretin 








PPYR1 


human: Z66526, U35232, U42387 


MGI:105374




pancreatic polypeptide receptor 1 








OXT 


human: M25650, M11186,






pre pro Oxytocin 


X03173 








mouse: NM_011025, M88355


OXTR 
OTR 

oxytocin receptor 


human: X64878 


MGI:109147




AVP 

Preprovasopressin 


human: M25647, X03172, 
M11166, AF031476, X62890,
X62891 


MGI:88121 




AVPR1A 
V1a receptor
vasopressin receptor 1a


human: U19906, L25615, S73899, 
AF030625, AF101725 
mouse: NM_016847






AVPR1B 
V1b receptor
vasopressin receptor 1b


human: D31833, L37112,
AF030512, AF101726 
mouse: NM_011924




AVPR2 
V2 receptor 
vasopressin receptor2 


human: Z11687, U04357, L22206,
U52112, AF030626, AF032388, 
AF101727, AF101728


MGI:88123




NTS 

proneurotensin/proneuromedin N 
Neurotensin tridecapeptide plus 
neuromedin N 


human: NM_006183, U91618 
mouse: MM64201 






NTSR1 

Neurotensin receptor NT1 


human: X70070 


MGI:97386




NTSR2 

Neurotensin receptor NT2 


human: Y10148 
mouse: NM_008747






SORT1 

sortilin 1 neurotensin receptor 3 


human: X98248, L10377 


MGI:1338015 




BDKRB1 

Bradykinin receptor 1 


human: U12512, U48231, U22346, 
AJ238044, AF117819


MGI:88144




BDKRB2 

Bradykinin receptor B2 


human: X69680, S45489, S56772, 
M88714, X86164, X86163, 
X86165 


MGI: 102845 




GNRH1 
GnRH 

gonadotrophin releasing hormone 


human: X01059, M12578, X15215 


MGI:95789


GNRH2 
GnRH 

gonadotrophin releasing hormone 


human: AF036329 






GNRHR 
GnRH 

gonadotrophin releasing hormone 
receptor 


human: NM_000406, L07949,
S60587, L03380, S77472, Z81148,
U19602 


MGI:95790 


CALCB 

calcitonin-related polypeptide, 
beta 


human: X02404, X04861 






CALCA 

calcitonin/calcitonin-related 

polypeptide, alpha


human: M26095, X00356, 
X03662, M64486, M12667, 
X02330, X15943


MGI:88249




CALCR 

calcitonin receptor


human: L00587 


MGI:101950




TAC1 (also called tac2) 
neurokinin A 


human: X54469, U37529, 
AC004140 


MGI:98474




TAC3 

neurokinin B

JLXW tsl-L V/AV1 J. J.JLXX J — ' 


human: NM_013251
rat: NM_017053






TACR2 

neurokinin a (subK) receptor 


human: M75105, M57414, 
M60284 






TACR1 

tachykinin receptor NK2 (Sub P
and K)


human: M84425, M74290, 
M81797, M76675, X65177, 
M84426 


MGI:98475




TACR3 

tachykinin receptor NK3 (Sub P 
and K) neuromedin K 


human: M89473, X65172






ADCYAP1 
PACAP 


human: X60435 


MGI: 105094 


NPPA 

atrial natriuretic peptide (ANP)
precursor 

atrial natriuretic factor (ANF) 
precursor

pronatriodilatin precursor 
prepronatriodilatin 


human: M54951, X01470, 
AL021155, M30262, K02043,
K02044 


MGI:97367 


NPPB 

atrial natriuretic peptide (BNP)
precursor

MA Ul Owl 


human: M25296, AL021155, 
M31776 

mouse: NM_008726






NPR1 

natriuretic peptide receptor 1


human: X15357, AB010491


MGI:97371


NPR2 

natriuretic peptide receptor 2


human: L13436, AJ005282, 
AB005647 


MGI:97372




NPR3 

natriuretic peptide receptor 3


human: M59305, AF025998, 
X52282 


MGI:97373




VIPR1 

VPAC1 

VIP receptor 1 


human: NM_004624, L13288, 
X75299, X77777, L20295, 
U11087 


MGI: 109272 




VIPR2 

VIP receptor 2 
PACAP receptor 


human: X95097, L36566, Y18423, 
L40764, AF027390 


MGI:107166






TABLE 12 






Gene 


GenBank and /or UniGene 
Accession Number 


MGI Database 
Accession 
Number 


5HT1A 

serotonin receptor 1 A 


human: M83181, AB041403, 
M28269, X13556


MGI:96273


5HT2A 

serotonin receptor 2A 


human: X57830 


MGI:109521




5HT3 


human: AJ005205, D49394, S82612, 


MGI:96282




serotonin receptor 3 


AJ005205, AJ003079, AJ005205, 








AJ003080, AJ003078 






5HT1B 


human: M81590, M81590, D10995, 


MGI:96274




5HT1Db


M83180, L09732, M75128, 






serotonin receptor 1B


AB041370, AB041377, AL049595 






5HT1D alpha 


human: AL049576 


MGI:96276


serotonin receptor 1D








5HT1E 


human: NM_000865, M91467, 






serotonin receptor 1E


M92826, Z11166






5HT2B 


human: NM_000867, X77307, 


MGI: 109323 




serotonin receptor 2B 


Z36748 




5HT2C 

serotonin receptor 2C 


human: NM_000868, U49516, 
M81778, X80763, AF208053 


MGI: 96281 




5HT4 


human: Y10437, Y08756, Y09586, 






serotonin receptor 4 


Y13584, Y12505, Y12506, Y12507, 






(has 5 subtypes isoforms)


AJ011371, AJ243213






5HT5A 

serotonin receptor 5A 


human: X81411 


MGI:96283




5HT5B


rat: L10073






serotonin receptor 5B 








5HT6 


human: L41147, AF007141 






serotonin receptor 6 








5HT7 


human: U68488, U68487, L21195,






serotonin receptor 7 


X98193 

mouse: MM8053 






sert 


human UniGene: L05568 


MGI:96285 




serotonin transporter 








TPRH 


human UniGene: AF057280, X52836, 


MGI:98796




TPH (Tph) 


L29306 






tryptophan hydroxylase 







TABLE 13



Gene


GenBank and /or UniGene 
Accession Number 


MGI Database 
Accession 
Number


P2RX1 
P2X1 receptor

purinergic receptor P2X, ligand-gated 
ion channel 


human: U45448, X83688, 
AF078925, AF020498 


MGI: 1098235 




P2RX3 

purinergic receptor P2X, ligand-gated 
ion channel, 3 


human: Y07683 

mouse: RIKEN BB459124, 

RIKEN BB452419






P2RX4 


human: U83993, Y07684, 


MGI:1338859




purinergic receptor P2X, ligand-gated 


U87270, AF000234 






ion channel, 4 






P2RX5 

purinergic receptor P2X, ligand-gated 
ion channel, 5 


human: AF168787,
AF016709, U49395, U49396, 
AF168787 
rat: AF070573 






P2RXL1 


human UniGene: AB002058 


MGI:1337113


purinergic receptor P2X-like 1, 

orphan receptor 

P2RX6 








P2RX7


human: Y09561, Y12851 


MGI: 1339957 




purinergic receptor P2X, ligand-gated 






ion channel, 7 








P2RY1 


human: Z49205


MGI: 105049 




purinergic receptor P2Y, G-protein 








coupled 1 








P2RY2 


human: U07225, S74902




purinergic receptor P2Y, G-protein 
coupled, 2 


rat: U56839 






P2RY4 pyrimidinergic receptor P2Y, 


human: X91852, X96597, 






G-protein coupled, 4 


U40223 


P2RY6 

pyrimidinergic receptor P2Y, G- 
protein coupled, 6 


human: X97058, U52464, 
AF007892, AF007891, 
AF007893 




P2RY11 

purinergic receptor P2Y, G-protein 
coupled, 11


human: AF030335 






TABLE 14 





Gene 


GenBank and /or UniGene 
Accession Number 


MGI Database 
Accession 
Number 




SCN1A 


human: X65362 


MGI:98246 




sodium channel, voltage-gated, 








type I, alpha 










human: L16242, L10338, U12194, 


MGI:98247




sodium channel, voltage-gated, 
type I, beta 


NM_001037






SCN2B 


human: AF049498, AF049497, 


MGI: 106921 




sodium channel, voltage-gated, 


AF007783 






type II, beta 








SCN5A


human: M77235 




sodium channel, voltage-gated, 
type V, alpha 








SCN2A1 




MGI:98248




sodium channel, voltage-gated, 








type II, alpha 1 






SCN2A2
sodium channel, voltage-gated, 
type II, alpha 2 


human: M94055, X65361, M91803






SCN3A 


human: AB037777, AJ251507 


MGI:98249




sodium channel, voltage-gated, 






type III, alpha 




SCN4A 


human: M81758, L01983, L04236,


MGI:98250




sodium channel, voltage-gated, 


U24693 






type IV, alpha 








SCN6A 


human: M91556 






sodium channel, voltage-gated, 








type VII or VI 








SCN8a 


human: AF225988, AB027567


MGI:103169


SCN8A sodium channel, 
voltage-gated, type VIII 








SCN9A 


human: X82835, RIKEN BB468679 






sodium channel, voltage-gated, 


mouse: MM40146 






type IX, alpha 








SCN10A 

sodium channel, voltage-gated, 
type X, 


human: NM_006514, AF117907






SCN11A 


human: AF188679


MGI:1345149




sodium channel, voltage-gated, 






type XI, alpha 








SCN12A 


human: NM_014139 






sodium channel, voltage-gated, 








type XII, alpha 








SCNN1A 


human: X76180, Z92978, L29007, 


MGI:101782




sodium channel, nonvoltage- 
gated 1 alpha 


U81961, U81961, U81961, U81961, 
U81961 






SCN4B 








sodium channel, voltage-gated, 








type IV, beta 








SCNN1B 

sodium channel, nonvoltage- 
gated 1, beta 


human: X87159, L36593, 
AJ005383, AC002300, U16023






SCNN1D 


human: U38254 






sodium channel, nonvoltage- 








gated 1, delta 


SCNN1G 

sodium channel, nonvoltage- 
gated 1, gamma


human: X87160, L36592, U35630 


MGI: 104695 


CLCN1 

chloride channel 1, skeletal 
muscle 


human: Z25884, Z25587, M97820, 
Z25753 


MGI:88417


CLCN2 

chloride channel 2 


human: AF026004 


MGI: 105061 


CLCN3 

chloride channel 3 
CIC3 


human: X78520, AL117599,
AF029346 


MGI: 103555 


CLCN4 

chloride channel 4 


human: AB019432, X77197


MGI: 104567 


CLCN5 

chloride channel 5 


human: X91906, X81836


MGI:99486


CLCN6 

chloride channel 6 


human: D28475, X83378, 
AL021155, X99473, X99474,
X96391, AL021155, AL021155, 
X99475, AL021155 


MGI: 1347049 


CLCN7 

chloride channel 7 


human: AL031600, U88844,
Z67743, AJ001910 


MGI: 1347048 


CLIC1 

chloride intracellular channel 1 


human: X87689, AJ012008, 
X87689, U93205, AF129756 




CLIC2 

chloride intracellular channel 2 


human: NM_001289 




CLIC3 

chloride intracellular channel 3 


human: AF102166




CLIC5 

chloride intracellular channel 5 


human: AW816405




CLCNKB 

chloride channel Kb 


human: Z30644, S80315, U93879




CLCNKA 
chloride channel Ka 


human: Z30643, U93878 


MGI: 1329026 




CLCA1 


human: AF039400, AF039401 


MGI:1316732




chloride channel, calcium 








activated, family member 1 








CLCA2 


human: AB026833 






chloride channel, calcium 








activated, family member 2 








CLCA3 

chloride channel, calcium 
activated, family member 3 


human:NM_004921 






CLCA4 


human: AK000072 






chloride channel, calcium 








activated, family member 4 






KCNA1 Kv1.1
potassium voltage-gated 
channel, shaker-related 
subfamily, member 1 


human: L02750 


MGI:96654




KCNA2 


human: Hs.248139, L02752 


MGI:96659


potassium voltage-gated 
channel, shaker-related 
subfamily, member 2 


mouse: MM56930 






KCNA3 


human: M85217, L23499, M38217, 


MGI:96660




potassium voltage-gated


M55515 




channel, shaker-related 
subfamily, member 3 








KCNA4 


human: M55514, M60450, L02751 


MGI:96661




potassium voltage-gated 








channel, shaker-related 






subfamily, member 4 








KCNA4L 








potassium voltage-gated








channel, shaker-related 








subfamily, member 4-like 




KCNA5 


human: Hs.150208, M55513, 


MGI:96662




potassium voltage-gated 


M83254, M60451, M55513 






channel, shaker-related 
subfamily, member 5 


mouse: MM1241 






KCNA6 


human: X17622 


MGI:96663




potassium voltage-gated 








channel, shaker-related 
subfamily, member 6 








KCNA7 




MGI:96664




potassium voltage-gated 








channel, shaker-related 








subfamily, member 7 






KCNA10 

potassium voltage-gated 
channel, shaker-related 
subfamily, member 10 


human: U96110 






KCNB1 


human: L02840, L02840, X68302, 


MGI:96666


potassium voltage-gated 
channel, Shab-related
subfamily, member 1 


AF026005 






KCNB2 


human: Hs.121498, U69962






potassium voltage-gated 


mouse: MM154372




channel, Shab-related 
subfamily, member 2 








KCNC1 


human: L00621, S56770 


MGI:96667




potassium voltage-gated 








channel, Shaw-related 






subfamily, member 1








KCNC2 




MGI:96668 




potassium voltage-gated 








channel, Shaw-related 








subfamily, member 2 







KCNC3

potassium voltage-gated 
channel, Shaw-related 
subfamily, member 3 


human: AF055989 


MGI:96669


KCNC4 

potassium voltage-gated 
channel, Shaw-related 
subfamily, member 4 


human: M64676 


MGI:96670


KCND1 

potassium voltage-gated 
channel, Shal-related family, 
member 1 


human: AJ005898, AF166003 


MGI:96671


KCND2 

potassium voltage-gated 
channel, Shal-related subfamily, 
member 2 


human: AB028967, AJ010969,
AC004888 




KCND3 

potassium voltage-gated 
channel, Shal-related subfamily, 
member 3 


human: AF120491, AF048713, 
AF048712, AL049557 




KCNE1 

potassium voltage-gated 
channel, Isk-related family, 
member 1 


mouse: NM_008424




KCNE1L 

potassium voltage-gated 
channel, Isk-related family, 
member 1-like 


human: AJ012743, NM_012282




KCNE2 

potassium voltage-gated 
channel, Isk-related family, 
member 2 


human: AF302095




KCNE3 


human: NM_005472, 






potassium voltage-gated 


rat: AJ271742 






channel, Isk-related family, 
member 3 


mouse: MM18733






KCNE4 


mouse: MM24386 






potassium voltage-gated 








channel, Isk-related family, 






member 4 








KCNF1 


human:AF033382 






potassium voltage-gated 








channel, subfamily F, member 1 








KCNG1 


human: AF033383, AL050404 




potassium voltage-gated 
channel, subfamily G, member 
1 








KCNG2 


human: NM_012283 






potassium voltage-gated 






channel, subfamily G, member 

2 








KCNH1 


human: AJ001366, AF078741,






potassium voltage-gated 


AF078742 






channel, subfamily H (eag- 


mouse: NM_010600




related), member 1 








KCNH2 


human: U04270, AJ010538,


MGI:1341722




potassium voltage-gated 


AB009071, AF052728 






channel, subfamily H (eag- 








related), member 2 








KCNH3 

potassium voltage-gated 
channel, subfamily H (eag-
related), member 3


human: AB022696, AB033108, 
Hs.64064 

mouse: NM_010601, MM100209





KCNH4

potassium voltage-gated 
channel, subfamily H (eag- 
related), member 4 


human: AB022698 
rat: BEC2 




KCNH5 

potassium voltage-gated 
channel, subfamily H (eag- 
related), member 5 


human: Hs.27043 
mouse: MM44465 




KCNJ1 

potassium inwardly-rectifying 
channel, subfamily J, member 1 


human: U03884, U12541, U12542, 
U12543 

rat: NM_017023




KCNJ2 

potassium inwardly-rectifying 
channel, subfamily J, member 2 


human: U16861, U12507, U24055, 
AF011904, U22413, AF021139


MGI: 104744 


KCNJ3 

potassium inwardly-rectifying 
channel, subfamily J, member 3 


human: U50964, U39196
mouse: NM_008426 




KCNJ4 

potassium inwardly-rectifying 
channel, subfamily J, member 4 


human: Hs.32505, U07364, Z97056, 
U24056, Z97056 
mouse: MM104760 


MGI: 104743 


KCNJ5
potassium inwardly-rectifying 
channel, subfamily J, member 5 


human: NM_000890 


MGI: 104755 


KCNJ6
potassium inwardly-rectifying 
channel, subfamily J, member 6 


human: Hs.11173, U52153, D87327, 
L78480, S78685, AJ001894 
mouse: NM_010606, MM4276
rat: NM_013192




KCNJ8 

potassium inwardly-rectifying 
channel, subfamily J, member 8 


human: D50315, D50312 


MGI: 1100508 


KCNJ9 

potassium inwardly-rectifying 
channel, subfamily J, member 9 


human: U52152 


MGI: 108007 
KCNJ10


human: Hs.66727, U52155, U73192, 


MGI: 1194504 




potassium inwardly-rectifying 


U73193 






channel, subfamily J, member 
10 








KCNJ11 


human: Hs.24814-1, D50582 


MGI: 107501 




potassium inwardly-rectifying 


mouse: MM4722 






channel, subfamily J, member 
11 








KCNJ12 


human: AF005214, L36069 


MGI: 108495 




potassium inwardly-rectifying 








channel, subfamily J, member 








12 






KCNJ13 

potassium inwardly-rectifying 
channel, subfamily J, member
13 


human: AJ007557, AB013889,
AF061118, AJ006128, AF082182
rat: AB034241, AB013890, 








AB034242 






guinea pig: AF200714 






KCNJ14 








potassium inwardly-rectifying 


human: Hs.278677 






channel, subfamily J, member


mouse: Kir2.4, MM68170 






14 






KCNJ15 

potassium inwardly-rectifying 
channel, subfamily J, member 
15 


human: Hs.17287, U73191, D87291, 
Y10745 

mouse: AJ012368, kir4.2, MM44238 






KCNJ16 


human: NM_018658, Kir5.1




potassium inwardly-rectifying 
channel, subfamily J, member 16


mouse: AB016197






KCNK1 


human: U76996, U33632, U90065


MGI: 109322 




potassium channel, subfamily 








K, member 1 (TWIK-1) 
KCNK2


human: AF004711, RIKEN






potassium channel, subfamily 


BB116025






K, member 2 (TREK-1)








KCNK3 


human: AF006823 


MGI:1100509




potassium channel, subfamily 








K, member 3 (TASK) 








KCNK4 






potassium inwardly-rectifying 
channel, subfamily K, member 
4 


human: AF247042, AL117564
mouse: NM_008431 






KCNK5 


human: NM_003740, AK001897 






potassium channel, subfamily 


mouse: AF259395 




K, member 5 (TASK-2) 








KCNK6 


human: AK022344 






potassium channel, subfamily 








K, member 6 (TWIK-2) 








KCNK7 


human: NM_005714 


MGI: 1341841 


potassium channel, subfamily 
K, member 7 


mouse: MM23020 






KCNK8 


mouse: NM_010609






potassium channel, subfamily 








K, member 8 








KCNK9 

potassium channel, subfamily 
K, member 9 


human: AF212829
guinea pig: AF212828






KCNK10 


human: AF279890 






potassium channel, subfamily 






K, member 10 (TREK2) 








KCNN1 


human: NM_002248, U69883 






potassium intermediate/small 








conductance calcium-activated 








channel, subfamily N, member 
1 
KCNN2


mouse: MM63515 






potassium intermediate/small 








conductance calcium-activated 
channel, subfamily member 2 
(hsk2) 








KCNN4 


human: Hs.10082, AF022797,


MGI: 1277957 




potassium intermediate/small 


AF033021, AF000972, AF022150 




conductance calcium-activated 
channel, subfamily N, member 
4 


mouse: MM9911 






KCNQ1 


human: U89364, AF000571, 


MGI:108083




potassium voltage-gated 


AF051426, AJ006345, AB015163, 




channel, KQT-like subfamily, 
member 1 


AB015163, AJ006345 






KCNQ2 


human: Y15065, D82346, 


MGI:1309503




potassium voltage-gated 


AF033348, AF074247, AF110020






channel, KQT-like subfamily, 






member 2 








KCNQ3 


human: NM_004519, AF033347,


MGI:1336181




potassium voltage-gated 


AF071491 






channel, KQT-like subfamily, 








member 3 






KCNQ4 

potassium voltage-gated 
channel, KQT-like subfamily, 
member 4 


human: Hs.241376, AF105202, 

AF105216 

mouse: AF249747 






KCNQ5 


human: NM_019842




potassium voltage-gated 
channel, KQT-like subfamily, 
member 5 







KCNS1


human: AF043473 






potassium voltage-gated 


mouse: NM_008435 






channel, delayed-rectifier, 
subfamily S, member 1 








KCNS2 


mouse: NM_008436 






potassium voltage-gated 








channel, delayed-rectifier, 






subfamily S, member 2 








KCNS3 


human: AF043472 






potassium voltage-gated 








channel, delayed-rectifier, 








subfamily S, member 3 






KCNAB1 

potassium voltage-gated 
channel, shaker-related 
subfamily, beta member 1 


L39833, U33428, L47665, X83127, 
U16953 


MGI:109155




KCNAB2 


human: U33429, AF044253, 




potassium voltage-gated 
channel, shaker-related 
subfamily, beta member 2 


AF029749 

mouse: NM_010598






KCNAB3 


human: NM_004732 


MGI:1336208




potassium voltage-gated 


mouse: MM57241 




channel, shaker-related 
subfamily, beta member 3 








KCNJN1 


human: Hs.248143, U53143 






potassium inwardly-rectifying 








channel, subfamily J, inhibitor 1 








KCNMA1 

potassium large conductance 
calcium-activated channel, 
subfamily M, alpha member 1 


human: U11058, U13913, U11717,
U23767, AF025999 


MGI:99923 
kcnma3


mouse: NM_008432 






potassium large conductance 








calcium-activated channel, 
subfamily M, alpha member 3 








KCNMB1 


rat: NM_019273






potassium large conductance








calcium-activated channel, 
subfamily M, beta member 1 








KCNMB2 


human: AF209747 






potassium large conductance 


mouse: NM_005832 






calcium-activated channel, 








subfamily M, beta member 2 






KCNMB3L 

potassium large conductance 
calcium-activated channel, 
subfamily M, beta member 3- 


human: AP000365 






like 






KCNMB3 

potassium large conductance 
calcium-activated channel 


human: NM_014407, AF214561 






KCNMB4 


human: AJ271372, AF207992, 






potassium large conductance 


RIKEN BB329438, RIKEN 




calcium-activated channel, sub 
M, beta 4 


BB265233 






HCN1 




MGI:1096392




hyperpolarization activated 








cyclic nucleotide-gated 






potassium channel 1 








Cav1.1 α1 1.1 CACNA1S


human: L33798, U30707 


MGI:88294




calcium channel, voltage- 








dependent, L type, alpha 1S








subunit 
Cav1.2 α1 1.2 CACNA1C


human: Z34815, L29536, Z34822, 






calcium channel, voltage- 


L29534, L04569, Z34817, Z34809, 






dependent, L type, alpha 1C 
subunit 


Z34813, Z34814, Z34820, Z34810, 
Z34811, L29529, Z34819, Z74996,
Z34812, Z34816, AJ224873,
Z34818, Z34821, AF070589,








Z26308, M92269 




Cav1.3 α1 1.3 CACNA1D
calcium channel, voltage- 
dependent, L type, alpha 1D
subunit 


human: M83566, M76558, D43747, 
AF055575 


MGI:88293




Cav1.4 α1 1.4 CACNA1F


human: AJ224874, AF235097, 


MGI: 1859639 


calcium channel, voltage- 
dependent, L type, alpha 1F
subunit 


AJ006216, AF067227, U93305 






Cav2.1 α1 2.1 CACNA1A P/Q


human: U79666, AF004883, 


MGI: 109482 




type calcium channel, voltage- 


AF004884, X99897, AB035727, 




dependent, P/Q type, alpha 1A 
subunit 


U79663, U79665, U79664, 
U79667, U79668, AF100774






Cav2.2 α1 2.2 CACNA1B


human: M94172, M94173, U76666 


MGI:88296




calcium channel, voltage- 








dependent, L type, alpha 1B
subunit 








Cav2.3 α1 2.3 CACNA1E


human: L29385, L29384, L27745 


MGI:106217




calcium channel, voltage- 








dependent, alpha 1E subunit








Cav3.1 α1 3.1 CACNA1G


human: AB012043, AF190860, 


MGI:1201678


calcium channel, voltage- 
dependent, alpha 1G subunit 


AF126966, AF227746, AF227744, 
AF134985, AF227745, AF227747,
AF126965, AF227749, AF134986, 
AF227748, AF227751, AF227750, 








AB032949, AF029228 
Cav3.2 α1 3.2 CACNA1H


human: AF073931, AF051946, 






calcium channel, voltage- 


AF070604 






dependent, alpha 1H subunit 








Cav3.3 α1 3.3 CACNA1I


human: AF142567, AL022319, 






calcium channel, voltage- 


AF211189, AB032946 






dependent, alpha 1I subunit










TABLE 15 






Gene 


GenBank and /or UniGene 
Accession Number 


MGI Database 
Accession 
Number 




NES (nestin) 


no human 


MGI:101784




scip


human: L26494 


MGI:101896




TABLE 16 




Gene 


GenBank and /or UniGene 
Accession Number 


MGI Database 
Accession 
Number 




Shh (Sonic Hedgehog) 


human: L38518


MGI:98297 




Smoothened Shh receptor 


human: U84401, AF114821


MGI: 108075 




Patched Shh binding protein 


human: NM_000264 
rat: AF079162 






TABLE 17

Gene                                          GenBank and/or UniGene Accession Number   MGI Database Accession Number

CALB1 (calbindin d28 K)                       human: X06661, M19879,                    MGI:88248

CALB2 (calretinin)                            human: NM_001740, X56667, X56668          MGI:101914

PVALB (parvalbumin)                           human: X63578, X63070, Z82184,            MGI:97821
                                              X52695, Z82184


TABLE 18

Gene                                          GenBank and/or UniGene Accession Number   MGI Database Accession Number

NTRK2 (Trk B)                                 human: U12140, X75958, S76473, S76474     MGI:97384

GFRA1 (GFR alpha 1)                           human: NM_005264, AF038420, AF038421,     MGI:1100842
                                              U97144, AF042080, U95847, AF058999

GFRA2 (GFRalpha 2)                            human: U97145, AF002700, U93703           MGI:1195462

GFRA3 (GFRalpha 3)                            human: AF051767                           MGI:1201403

trka                                          human: M23102, X03541, X04201, X06704,    MGI:97383
Neurotrophin receptor                         X62947, M23102, X62947, M23102,
                                              AB019488, M12128

trkc                                          human: U05012, U05012, S76475,            MGI:97385
Neurotrophin receptor                         AJ224521, S76476, AF052184

ret                                           human: S80552                             MGI:97902
Neurotrophic factor receptor

All of the sequences identified by the sequence database identifiers in Tables 4-18
are hereby incorporated by reference in their entireties.

In yet another aspect of the invention, a promoter directs tissue-specific expression
of the tagged ribosomal protein or mRNA binding protein sequence to which it is operably
linked. For example, expression of the tagged ribosomal protein or mRNA binding protein
coding sequences may be controlled by any tissue-specific promoter/enhancer element
known in the art. Promoters that may be used to control expression include, but are not
limited to, the following animal transcriptional control regions that exhibit tissue specificity
and that have been utilized in transgenic animals: elastase I gene control region, which is
active in pancreatic acinar cells (Swift et al., 1984, Cell 38:639-646; Ornitz et al., 1986,
Cold Spring Harbor Symp. Quant. Biol. 50:399-409; MacDonald, 1987, Hepatology 7:425-515);
enolase promoter, which is active in brain regions, including the striatum, cerebellum,
CA1 region of the hippocampus, or deep layers of cerebral neocortex (Chen et al., 1998,
Molecular Pharmacology 54(3):495-503); insulin gene control region, which is active in
pancreatic beta cells (Hanahan, 1985, Nature 315:115-22); immunoglobulin gene control
region, which is active in lymphoid cells (Grosschedl et al., 1984, Cell 38:647-58; Adames
et al., 1985, Nature 318:533-38; Alexander et al., 1987, Mol. Cell. Biol. 7:1436-44); mouse
mammary tumor virus control region, which is active in testicular, breast, lymphoid and
mast cells (Leder et al., 1986, Cell 45:485-95); albumin gene control region, which is active
in liver (Pinkert et al., 1987, Genes and Devel. 1:268-76); alpha-fetoprotein gene control
region, which is active in liver (Krumlauf et al., 1985, Mol. Cell. Biol. 5:1639-48; Hammer
et al., 1987, Science 235:53-58); alpha 1-antitrypsin gene control region, which is active in
the liver (Kelsey et al., 1987, Genes and Devel. 1:161-71); β-globin gene control region,
which is active in myeloid cells (Mogram et al., 1985, Nature 315:338-40; Kollias et al.,
1986, Cell 46:89-94); myelin basic protein gene control region, which is active in
oligodendrocyte cells in the brain (Readhead et al., 1987, Cell 48:703-12); myosin light
chain-2 gene control region, which is active in skeletal muscle (Sani, 1985, Nature 314:283-86);
and gonadotropic releasing hormone gene control region, which is active in the
hypothalamus (Mason et al., 1986, Science 234:1372-78).

In other embodiments, the gene sequence from which the regulatory sequence
derives is protein kinase C, gamma (GenBank Accession Number: Z15114 (human); MGI
Database Accession Number: MGI:97597); fos (UniGene No. MM5043 (mouse)); TH-elastin;
Pax7 (Mansouri, 1998, The role of Pax3 and Pax7 in development and cancer, Crit.
Rev. Oncog. 9(2):141-9); Eph receptor (Mellitzer et al., 2000, Control of cell behaviour by
signalling through Eph receptors and ephrins, Curr. Opin. Neurobiol. 10(3):400-08; Suda et
al., 2000, Hematopoiesis and angiogenesis, Int. J. Hematol. 71(2):99-107; Wilkinson, 2000,
Eph receptors and ephrins: regulators of guidance and assembly, Int. Rev. Cytol.
196:177-244; Nakamoto, 2000, Eph receptors and ephrins, Int. J. Biochem. Cell Biol.
32(1):7-12; Tallquist et al., 1999, Growth factor signaling pathways in vascular
development, Oncogene 18(55):7917-32); islet-1 (Bang et al., 1996, Regulation of
vertebrate neural cell fate by transcription factors, Curr. Opin. Neurobiol. 6(1):25-32;
Ericson et al., 1995, Sonic hedgehog: a common signal for ventral patterning along the
rostrocaudal axis of the neural tube, J. Dev. Biol. 39(5):809-16); β-actin; thy-1 (Caroni,
1997, Overexpression of growth-associated proteins in the neurons of adult transgenic mice,
J. Neurosci. Methods 71(1):3-9).

Nucleic acids of the invention may include all or a portion of the upstream
regulatory sequences of the selected gene. The characterizing gene regulatory sequences
preferably direct expression of the tagged ribosomal protein or mRNA binding protein
sequences in substantially the same pattern as the endogenous characterizing gene within
the transgenic organism, or tissue derived therefrom.

In certain embodiments, the nucleic acids encoding the molecularly tagged
ribosomal proteins or mRNA binding proteins may be selectively expressed in random but
distinct subsets of cells, as described in Feng et al. (2000, Imaging neuronal subsets in
transgenic mice expressing multiple spectral variants of GFP, Neuron 28(1):41-51, which is
hereby incorporated by reference in its entirety). Using such methods, independently
generated transgenic lines may express the nucleic acids encoding the molecularly tagged
ribosomal proteins or mRNA binding proteins in a unique pattern, even though all
incorporate identical regulatory elements.

5.6. INTRODUCTION OF VECTORS INTO HOST CELLS 

In one aspect of the invention, a vector containing the nucleic acid encoding the
tagged ribosomal protein or tagged mRNA binding protein and regulatory sequences
(preferably characterizing gene regulatory sequences) can be introduced transiently or stably
into the genome of a host cell or be maintained episomally. In another aspect of the
invention, the vector can be transiently transfected wherein it is not integrated, but is
maintained as an episome.

The terms "host cell" and "recombinant host cell" are used interchangeably herein.
It is understood that such terms refer not only to the particular subject cell but to the
progeny or potential progeny of such a cell. Because certain modifications may occur in
succeeding generations due to either mutation or environmental influences, such progeny
may not, in fact, be identical to the parent cell, but are still included within the scope of the
term as used herein.

A host cell can be any prokaryotic (e.g., bacterium such as E. coli) or eukaryotic cell
(e.g., a cell from a yeast, plant, insect (e.g., Drosophila), amphibian, amniote, or mammal,
to name but a few), preferably a vertebrate cell, more preferably a mammalian cell, and
most preferably, a mouse cell. In certain embodiments, the host cell is a human cell, either
a cultured cell, or in certain embodiments, an immortalized cultured cell or primary human
cell. In specific embodiments, the host cells are human embryonic stem cells, or other
human stem cells (or murine stem cells, including embryonic stem cells), tumor cells or
cancer cells (particularly circulating cancer cells such as those resulting from leukemias and
other blood system cancers). Host cells intended to be part of the invention include ones
that comprise nucleic acids encoding one or more tagged ribosomal or tagged mRNA
binding proteins and, optionally, operably associated with characterizing gene sequences
that have been engineered to be present within the host cell (e.g., as part of a vector). The
invention encompasses genetically engineered host cells that contain any of the foregoing
tagged ribosomal protein or tagged mRNA binding protein coding sequences, optionally
operatively associated with a regulatory element (preferably from a characterizing gene, as
described above) that directs the expression of the coding sequences in the host cell. Both
cDNA and genomic sequences can be cloned and expressed. In a preferred aspect, the host
cell is recombination deficient, i.e., Rec⁻, and used for BAC recombination. In specific
embodiments the host cell may contain more than one type of ribosomal or mRNA binding
protein fusion, where the fusion of the different ribosomal and mRNA binding proteins is to
the same or different peptide tags.

A vector containing a nucleotide sequence of the invention can be introduced into
the desired host cell by methods known in the art, e.g., transfection, transformation,
transduction, electroporation, infection, microinjection, cell fusion, DEAE dextran, calcium
phosphate precipitation, liposomes, LIPOFECTIN™ (source), lysosome fusion, synthetic
cationic lipids, use of a gene gun or a DNA vector transporter, such that the nucleotide
sequence is transmitted to offspring in the line. For various techniques for transformation
or transfection of mammalian cells, see Keown et al., 1990, Methods Enzymol. 185:527-37;
Sambrook et al., 2001, Molecular Cloning, A Laboratory Manual, Third Edition, Cold
Spring Harbor Laboratory Press, N.Y.

In certain embodiments, the vector is introduced into a cultured cell. In other 
embodiments, the vector is introduced into a proliferating cell (or population of cells), e.g., 
a tumor cell, a stem cell, a blood cell, a bone marrow cell, a cell derived from a tissue 
biopsy, etc. 

Particularly preferred embodiments of the invention encompass methods of
introduction of the vector containing the nucleic acid of the invention, using pronuclear
injection of a nucleic acid construct of the invention into the mononucleus of a mouse
embryo and infection with a viral vector comprising the construct. Methods of pronuclear
injection into mouse embryos are well-known in the art and described in Hogan et al., 1986,
Manipulating the Mouse Embryo, Cold Spring Harbor Laboratory Press, New York, NY
and Wagner et al., U.S. Patent No. 4,873,191, issued October 10, 1989, herein incorporated
by reference in their entireties.

In preferred embodiments, a vector containing the nucleic acid of the invention is
introduced into any genetic material which ultimately forms a part of the nucleus of the
zygote of the animal to be made transgenic, including the zygote nucleus. In one
embodiment, the nucleic acid of the invention can be introduced in the nucleus of a
primordial germ cell which is diploid, e.g., a spermatogonium or oogonium. The primordial
germ cell is then allowed to mature to a gamete which is then united with another gamete or
source of a haploid set of chromosomes to form a zygote. In another embodiment, the
vector containing the nucleic acid of the invention is introduced in the nucleus of one of the
gametes, e.g., a mature sperm, egg or polar body, which forms a part of the zygote. In
preferred embodiments, the vector containing the nucleic acid of the invention is introduced
in either the male or female pronucleus of the zygote. More preferably, it is introduced in
either the male or the female pronucleus as soon as possible after the sperm enters the egg,
in other words, right after the formation of the male pronucleus, when the pronuclei are
clearly defined and are well separated, each being located near the zygote membrane.

In a most preferred embodiment, the vector containing the nucleic acid of the
invention is added to the male DNA complement, or a DNA complement other than the
DNA complement of the female pronucleus, of the zygote prior to its being processed by the
ovum nucleus or the zygote female pronucleus. In an alternate embodiment, the vector
containing the nucleic acid of the invention could be added to the nucleus of the sperm after
it has been induced to undergo decondensation. Additionally, the vector containing the
transgene may be mixed with sperm and then the mixture injected into the cytoplasm of an
unfertilized egg. Perry et al., 1999, Science 284:1180-1183. Alternatively, the vector may
be injected into the vas deferens of a male mouse and the male mouse mated with normal
estrus females. Huguet et al., 2000, Mol. Reprod. Dev. 56:243-247.

Preferably, the nucleic acid of the invention is introduced using any technique so
long as it is not destructive to the cell, nuclear membrane or other existing cellular or
genetic structures. The nucleic acid of the invention is preferentially inserted into the
nucleic genetic material by microinjection. Microinjection of cells and cellular structures is
known and is used in the art. Also known in the art are methods of transplanting the
embryo or zygote into a pseudopregnant female where the embryo is developed to term and
the nucleic acid of the invention is integrated and expressed. See, e.g., Hogan et al., 1986,
Manipulating the Mouse Embryo, Cold Spring Harbor Laboratory Press, New York, NY.

Viral methods of inserting nucleic acids are known in the art.

For stable transfection of cultured mammalian cells, only a small fraction of cells
may integrate the foreign DNA into their genome. The efficiency of integration depends
upon the vector and transfection technique used. In order to identify and select integrants, a
gene that encodes a selectable marker (e.g., for resistance to antibiotics) is generally
introduced into the host cells along with a nucleotide sequence of the invention. Preferred
selectable markers include those which confer resistance to drugs, such as G418,
hygromycin and methotrexate. Cells stably transfected with the introduced nucleic acid can
be identified by drug selection (e.g., cells that have incorporated the selectable marker gene
will survive, while the other cells die). Such methods are particularly useful in methods
involving homologous recombination in mammalian cells (e.g., in murine ES cells) prior to
introducing the recombinant cells into mouse embryos to generate chimeras.

A number of selection systems may be used to select transformed host cells. In
particular, the vector may contain certain detectable or selectable markers. Other methods
of selection include, but are not limited to, selecting for another marker. For example, the
herpes simplex virus thymidine kinase (Wigler et al., 1977, Cell 11:223), hypoxanthine-guanine
phosphoribosyltransferase (Szybalska and Szybalski, 1962, Proc. Natl. Acad. Sci. USA
48:2026), and adenine phosphoribosyltransferase (Lowy et al., 1980, Cell 22:817) genes can
be employed in tk-, hgprt- or aprt- cells, respectively. Also, antimetabolite resistance can be
used as the basis of selection for the following genes: dhfr, which confers resistance to
methotrexate (Wigler et al., 1980, Proc. Natl. Acad. Sci. USA 77:3567; O'Hare et al., 1981,
Proc. Natl. Acad. Sci. USA 78:1527); gpt, which confers resistance to mycophenolic acid
(Mulligan and Berg, 1981, Proc. Natl. Acad. Sci. USA 78:2072); neo, which confers
resistance to the aminoglycoside G-418 (Colberre-Garapin et al., 1981, J. Mol. Biol. 150:1);
and hygro, which confers resistance to hygromycin (Santerre et al., 1984, Gene 30:147).

5.7. METHODS OF PRODUCING TRANSFORMED ORGANISMS 

The nucleic acid of the invention may integrate into the genome of the founder
organism (or an oocyte or embryo that gives rise to the founder organism), preferably by
random integration. If random, the integration preferably does not knock out, e.g., insert
into, an endogenous gene(s) such that the endogenous gene is not expressed or is
mis-expressed.

In other embodiments, the nucleic acid of the invention may integrate by a directed
method, e.g., by directed homologous recombination ("knock-in") (Chappel, U.S. Patent
No. 5,272,071; PCT Publication No. WO 91/06667, published May 16, 1991; U.S.
Patent 5,464,764, Capecchi et al., issued November 7, 1995; U.S. Patent 5,627,059,
Capecchi et al., issued May 6, 1997; U.S. Patent 5,487,992, Capecchi et al., issued January
30, 1996). Preferably, when homologous recombination is used, it does not knock out or
replace the host's endogenous copy of the characterizing gene (or characterizing gene
ortholog).

Methods for generating cells having targeted gene modifications through
homologous recombination are known in the art. The construct will comprise at least a
portion of the characterizing gene with a desired genetic modification, e.g., insertion of the
nucleotide sequence coding for the tagged ribosomal protein, and will include regions of
homology to the target locus, i.e., the endogenous copy of the characterizing gene in the
host's genome. DNA constructs for random integration need not include regions of
homology to mediate recombination. Markers can be included for performing positive and
negative selection for insertion of the nucleic acid of the invention.

To create a homologous recombinant organism, a homologous recombination vector
is prepared in which the nucleotide sequence encoding the tagged ribosomal protein is
flanked at its 5' and 3' ends by characterizing gene sequences to allow for homologous
recombination to occur between the exogenous gene carried by the vector and the
endogenous characterizing gene in an embryonic stem cell. The additional flanking nucleic
acid sequences are of sufficient length for successful homologous recombination with the
endogenous characterizing gene. Typically, several kilobases of flanking DNA (both at the
5' and 3' ends) are included in the vector. Methods for constructing homologous
recombination vectors and homologous recombinant animals are described further in
Thomas and Capecchi, 1987, Cell 51:503; Bradley, 1991, Curr. Opin. Bio/Technol. 2:823-29;
and PCT Publication Nos. WO 90/11354, WO 91/01140, WO 92/0968, and WO
93/04169.

A transgenic animal is a non-human animal, preferably a mammal, more preferably
a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a
nucleic acid of the invention, i.e., has a non-endogenous (i.e., heterologous) nucleic acid
sequence present as an extrachromosomal element in a portion of its cells or stably integrated
into its germ line DNA (i.e., in the genomic sequence of most or all of its cells). Other
examples of transgenic animals include non-human primates, sheep, dogs, cows, goats,
chickens, amphibians, etc. The invention also includes transgenic plants and fungi
(including yeast). Unless otherwise indicated, it will be assumed that a transgenic animal
comprises stable changes to the germline sequence. Heterologous nucleic acid is introduced
into the germ line of such a transgenic animal by genetic manipulation of, for example,
embryos or embryonic stem cells of the host animal.






WO 03/038049 



PCT/US02/34645 



As discussed above, transformed organisms of the invention, e.g., transgenic
animals, are preferably generated by random integration of a vector containing a nucleic
acid of the invention into the genome of the organism, for example, by pronuclear injection
in an animal zygote as described above. Other methods involve introducing the vector into
cultured embryonic cells, for example ES cells, and then introducing the transformed cells
into animal blastocysts, thereby generating "chimeras" or "chimeric animals", in which
only a subset of cells have the altered genome. Chimeras are primarily used for breeding
purposes in order to generate the desired transgenic animal. Animals having a heterozygous
alteration are generated by breeding of chimeras. Male and female heterozygotes are
typically bred to generate homozygous animals.

A homologously recombinant organism may include, but is not limited to, a
recombinant animal, such as a non-human animal, preferably a mammal, more preferably a
mouse, in which an endogenous gene has been altered by homologous recombination
between the endogenous gene and an exogenous DNA molecule introduced into a cell of the
animal, e.g., an embryonic cell of the animal, prior to development of the animal.

In a preferred embodiment, a transgenic animal of the invention is created by
introducing a nucleic acid of the invention, encoding the characterizing gene regulatory
sequences operably linked to nucleotide sequences encoding a tagged ribosomal protein,
into the male pronuclei of a fertilized oocyte, e.g., by microinjection or retroviral infection,
and allowing the oocyte to develop in a pseudopregnant female foster animal. Methods for
generating transgenic animals via embryo manipulation and microinjection, particularly
animals such as mice, have become conventional in the art and are described, for example,
in U.S. Patent Nos. 4,736,866 and 4,870,009, U.S. Patent No. 4,873,191, in Hogan,
Manipulating the Mouse Embryo, (Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, N.Y., 1986) and in Wakayama et al., 1999, Proc. Natl. Acad. Sci. USA, 96:14984-89.
Similar methods are used for production of other transgenic animals.

A transgenic founder animal can be identified based upon the presence of the nucleic
acid of the invention in its genome and/or expression of mRNA encoding the nucleic acid of
the invention in tissues or cells of the animals. A transgenic founder animal can then be
used to breed additional animals carrying the nucleic acid of the invention as described
supra. Moreover, transgenic animals carrying the nucleic acid of the invention can further
be bred to other transgenic animals carrying other nucleic acids of the invention.

In another embodiment, the nucleic acid of the invention is inserted into the genome
of an embryonic stem (ES) cell, followed by injection of the modified ES cell into a
blastocyst-stage embryo that subsequently develops to maturity and serves as the founder
animal for a line of transgenic animals.

In another embodiment, a vector bearing a nucleic acid of the invention is
introduced into ES cells (e.g., by electroporation) and cells in which the introduced gene has
homologously recombined with the endogenous gene are selected. See, e.g., Li et al., 1992,
Cell 69:915. For embryonic stem (ES) cells, an ES cell line may be employed, or
embryonic cells may be obtained freshly from a host, e.g., mouse, rat, guinea pig, etc.

After transformation, ES cells are grown on an appropriate feeder layer, e.g., a
fibroblast-feeder layer, in an appropriate medium and in the presence of appropriate growth
factors, such as leukemia inhibitory factor (LIF). Cells that contain the construct may be
detected by employing a selective medium. Transformed ES cells may then be used to
produce transgenic animals via embryo manipulation and blastocyst injection. (See, e.g.,
U.S. Pat. Nos. 5,387,742, 4,736,866 and 5,565,186 for methods of making transgenic
animals.)

Stable expression of the construct is preferred. For example, ES cells that stably
express a nucleotide sequence encoding a tagged ribosomal protein may be engineered.
Rather than using vectors that contain viral origins of replication, ES host cells can be
transformed with DNA, e.g., a plasmid, controlled by appropriate expression control
elements (e.g., promoter, enhancer sequences, transcription terminators, polyadenylation
sites, etc.), and a selectable marker. Following the introduction of the foreign DNA,
engineered ES cells may be allowed to grow for 1-2 days in an enriched medium, and then are
switched to a selective medium. The selectable marker in the recombinant plasmid confers
resistance to the selection and allows cells that have stably integrated the plasmid into their
chromosomes to grow and be expanded into cell lines. This method may advantageously be
used to engineer ES cell lines that express a nucleotide sequence encoding a tagged ribosomal
protein.

The selected ES cells are then injected into a blastocyst of an animal (e.g., a mouse)
to form aggregation chimeras. See, e.g., Bradley, 1987, in Teratocarcinomas and Embryonic
Stem Cells: A Practical Approach, Robertson, ed., IRL, Oxford, 113-52. Blastocysts are
obtained from 4 to 6 week old superovulated females. The ES cells are trypsinized, and the
modified cells are injected into the blastocoel of the blastocyst. After injection, the
blastocysts are implanted into the uterine horns of a suitable pseudopregnant female foster
animal. Alternatively, the ES cells may be incorporated into a morula to form a morula
aggregate which is then implanted into a suitable pseudopregnant female foster animal.
Females are then allowed to go to term and the resulting litters screened for mutant cells 
having the construct. 

The chimeric animals are screened for the presence of the modified gene. By
providing for a different phenotype of the blastocyst and the ES cells, chimeric progeny can
be readily detected. Male and female chimeras having the modification are mated to
produce homozygous progeny. Only chimeras with transformed germline cells will
generate homozygous progeny. If the gene alterations cause lethality at some point in
development, tissues or organs can be maintained as allogeneic or congenic grafts or
transplants, or in in vitro culture.

Progeny harboring homologously recombined or integrated DNA in their germline
cells can be used to breed animals in which all cells of the animal contain the homologously
recombined DNA by germline transmission of the nucleic acid of the invention.

Clones of the non-human transgenic animals described herein can also be produced
according to the methods described in Wilmut et al., 1997, Nature 385:810-13 and PCT
Publication Nos. WO 97/07668 and WO 97/07669.

Once the transgenic mice are generated they may be bred and maintained using
methods well known in the art. By way of example, the mice may be housed in an
environmentally controlled facility maintained on a 10 hour dark: 14 hour light cycle. Mice
are mated when they are sexually mature (6 to 8 weeks old). In certain embodiments, the
transgenic founders or chimeras are mated to an unmodified animal (i.e., an animal having
no cells containing the nucleic acid of the invention). In a preferred embodiment, the
transgenic founder or chimera is mated to C57BL/6 mice (Jackson Laboratories). In a
specific embodiment where the nucleic acid of the invention is introduced into ES cells and
a chimeric mouse is generated, the chimera is mated to 129/Sv mice, which have the same
genotype as the embryonic stem cells. Protocols for successful creation and breeding of
transgenic mice are known in the art (Manipulating the Mouse Embryo. A Laboratory
Manual, 2nd edition. B. Hogan, Beddington, R., Costantini, F. and Lacy, E., eds. 1994. Cold
Spring Harbor Laboratory Press: Plainview, NY). Preferably, a founder male is mated with
two females and a founder female is mated with one male. Preferably two females are
rotated through a male's cage every 1-2 weeks. Pregnant females are housed 1 or 2 per
cage. Preferably, pups are ear tagged, genotyped, and weaned at 21 days. Males and
females are housed separately. Preferably log sheets are kept for any mated animal. By way
of example and not limitation, the information recorded should include pedigree, birth date,
sex, ear tag number, source of mother and father, genotype, dates mated and generation.
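
By way of illustration only, the log sheet described above could be kept as a simple structured record. The following Python sketch is not part of the disclosed methods. The class name, field names and sample values are hypothetical. Only the list of recorded items and the 21-day weaning age are taken from the text.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

# From the text: pups are weaned at 21 days.
WEANING_AGE_DAYS = 21


@dataclass
class MatingLogEntry:
    """One log-sheet row for a mated animal (hypothetical field names)."""
    pedigree: str
    birth_date: date
    sex: str
    ear_tag: str
    mother_source: str
    father_source: str
    genotype: str
    generation: str
    dates_mated: list[date] = field(default_factory=list)

    def weaning_date(self) -> date:
        """Date on which a pup born on birth_date reaches weaning age."""
        return self.birth_date + timedelta(days=WEANING_AGE_DAYS)


# Sample values are invented for illustration.
entry = MatingLogEntry(
    pedigree="line A, founder 3",
    birth_date=date(2002, 1, 7),
    sex="F",
    ear_tag="0142",
    mother_source="C57BL/6",
    father_source="founder 3",
    genotype="heterozygous",
    generation="F1",
)
entry.dates_mated.append(date(2002, 3, 4))
```

Keeping the mating dates as a list allows one entry per animal to accumulate successive matings, which matches the rotation of females through a male's cage described above.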

More specifically, founder animals heterozygous for the nucleic acid of the
invention may be mated to generate a homozygous line as follows: A heterozygous founder
animal, designated as the P1 generation, is first mated with a non-transgenic mouse. The
founder is then mated with a transgenic offspring of that mating, designated as the F1
generation, which is of the opposite sex and heterozygous for the nucleic acid of the
invention (backcross). Based on classical genetics, one fourth of the offspring of this
backcross are homozygous for the nucleic acid of the invention. In a preferred embodiment,
transgenic founders are individually backcrossed to an inbred or outbred strain of choice.
Different founders should not be intercrossed, since different expression patterns may result
from separate nucleic acid integration events.
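
The one-fourth expectation stated above follows from a single-locus Punnett square. The Python sketch below is an illustration only. It assumes a single integration site inherited in Mendelian fashion. The function name `cross` and the allele symbols `T` (transgene present) and `-` (no insertion) are hypothetical notation, not terms used in this disclosure.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent_a: str, parent_b: str) -> dict[str, Fraction]:
    """Expected genotype fractions for a single-locus cross.

    Each parent is a two-character genotype such as "T-" (heterozygous),
    "TT" (homozygous) or "--" (non-transgenic).
    """
    counts = Counter(
        # Sort so that "T-" and "-T" are counted as the same genotype.
        "".join(sorted(a + b, reverse=True))
        for a, b in product(parent_a, parent_b)
    )
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Backcross of a heterozygous founder to a heterozygous F1 animal.
backcross = cross("T-", "T-")
# Founder mated to a non-transgenic mouse.
outcross = cross("T-", "--")
```

Under these assumptions the backcross gives one fourth `TT`, one half `T-` and one fourth `--`, while the outcross gives one half `T-` and one half `--`.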

The determination of whether a transgenic mouse is homozygous or heterozygous
for the nucleic acid of the invention is as follows:

An offspring of the above described breeding cross is mated to a normal control
non-transgenic animal. The offspring of this second mating are analyzed for the presence of
the nucleic acid of the invention by the methods described below. If all offspring of this
cross test positive for the nucleic acid of the invention, the mouse in question is
homozygous for the nucleic acid of the invention. If, on the other hand, some of the
offspring test positive for the nucleic acid of the invention and others test negative, the
mouse in question is heterozygous for the nucleic acid of the invention.
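
The decision rule of this test cross can be sketched in a few lines of Python. This is an illustration under stated assumptions: a single integration site, Mendelian inheritance, and an error-free genotyping assay. The function names and return strings are hypothetical. The second function is an addition for illustration: it gives the chance that a heterozygous parent yields only positive offspring, which is why a larger litter makes a call of homozygosity more reliable.

```python
from fractions import Fraction


def classify_parent(offspring_positive: list[bool]) -> str:
    """Apply the test-cross rule to the genotyping results of one litter.

    Each element is True if that offspring tested positive for the
    nucleic acid of the invention.
    """
    if not offspring_positive:
        raise ValueError("at least one offspring result is required")
    if all(offspring_positive):
        return "homozygous"
    if any(offspring_positive):
        return "heterozygous"
    # Not addressed by the text: no offspring carries the nucleic acid.
    return "undetermined"


def chance_heterozygote_gives_all_positive(n_offspring: int) -> Fraction:
    """Probability that a heterozygous parent mated to a non-transgenic
    animal produces n_offspring positive offspring in a row (1/2 each)."""
    return Fraction(1, 2) ** n_offspring


call = classify_parent([True, True, False, True])
risk = chance_heterozygote_gives_all_positive(8)
```

With eight offspring, for example, a heterozygous parent would be miscalled as homozygous with probability 1/256 under these assumptions.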

An alternative method for distinguishing between a transgenic animal which is
heterozygous and one which is homozygous for the nucleic acid of the invention is to
measure signal intensity with radioactive probes following Southern blot analysis of the DNA
of the animal. Animals homozygous for the nucleic acid of the invention would be expected
to produce higher intensity signals from probes specific for the nucleic acid of the invention
than would heterozygous transgenic animals.

In a preferred embodiment, the transgenic mice are so highly inbred as to be genetically
identical except for sexual differences. The homozygotes are tested using backcross and
intercross analysis to ensure homozygosity. Homozygous lines for each integration site in
founders with multiple integrations are also established. Brother/sister matings for 20 or
more generations define an inbred strain. In another preferred embodiment, the transgenic
lines are maintained as hemizygotes.

In an alternative embodiment, individual genetically altered mouse strains are also
cryopreserved rather than propagated. Methods for freezing embryos for maintenance of
founder animals and transgenic lines are known in the art. Gestational day 2.5 embryos are
isolated and cryopreserved in straws and stored in liquid nitrogen. The first straw and the
last straw are subsequently thawed and transferred to foster females to demonstrate viability
of the line with the assumption that all embryos frozen between the first straw and the last
straw will behave similarly. If viable progeny are not observed, a second embryo transfer
will be performed. Methods for reconstituting frozen embryos and bringing the embryos to
term are known in the art.

The nucleic acid encoding the molecularly tagged ribosomal protein or mRNA
binding protein may be introduced into the genome of a founder plant (or embryo that gives
rise to the founder plant) using methods well known in the art (Newell, 2000, Plant
transformation technology. Developments and applications, Mol. Biotechnol. 16(1):53-65;
Kumar and Fladung, 2001, Controlling transgene integration in plants, Trends in Plant
Science 6(4):155-159). The nucleic acid encoding the molecularly tagged ribosomal
protein or mRNA binding protein may be introduced into the genome of bacteria and yeast
using methods described in Ausubel et al., 1989, Current Protocols in Molecular Biology,
Greene Publishing Associates and Wiley Interscience, N.Y., Chapters 1 and 13, respectively.

5.7.1. HOMOLOGOUS RECOMBINATION IN BACTERIAL ARTIFICIAL CHROMOSOMES

The invention provides transformed organisms, e.g., transgenic mice, that express a
tagged ribosomal protein within a chosen cell type (see infra). In preferred embodiments,
BAC-mediated recombination (Yang et al., 1997, Nat. Biotechnol. 15(9):859-865) is used
to create the transformed organism. Such expression is achieved by using the endogenous
regulatory sequences of a particular gene, wherein expression of the gene is a defining
characteristic of the chosen cell type (as also described in PCT/US02/04765, entitled
"Collections of Transgenic Animal Lines (Living Library)" by Serafini, published as WO
02/064749 on August 22, 2002, which is incorporated by reference herein in its entirety). In
another preferred embodiment, a collection of transgenic mice expressing tagged ribosomal
proteins within a set of chosen cell types is assembled, as described infra.

Vectors used in the methods of the invention preferably can accommodate, and in
certain embodiments comprise, large pieces of heterologous DNA such as genomic
sequences. Such vectors can contain an entire genomic locus, or at least sufficient sequence
to confer the endogenous regulatory expression pattern and to insulate the expression of coding
sequences from the effect of regulatory sequences surrounding the site of integration of the
nucleic acid of the invention in the genome, so as to better mimic wild-type expression. When
entire genomic loci or significant portions thereof are used, few, if any, site-specific
expression problems of a nucleic acid of the invention are encountered, unlike insertions of
nucleic acids into smaller sequences. In a preferred embodiment, the vector is a BAC
containing genomic sequences into which a selected sequence encoding a molecular tag,
e.g., an epitope tag, has been inserted by directed homologous recombination in bacteria,
e.g., by the methods of Heintz, WO 98/59060; Heintz et al., WO 01/05962; Yang et al.,
1997, Nature Biotechnol. 15: 859-865; Yang et al., 1999, Nature Genetics 22: 327-35;
which are incorporated herein by reference in their entireties.

Using such methods, a BAC can be modified directly in a recombination-deficient
E. coli host strain by homologous recombination.

In a preferred embodiment, homologous recombination in bacteria is used for
target-directed insertion of a sequence encoding a molecularly tagged ribosomal protein into the
genomic DNA encoding sufficient regulatory sequences (termed "characterizing gene
sequences") to promote expression of the tagged ribosomal protein in the endogenous
expression pattern of the characterizing gene, which sequences have been inserted into the
BAC. The BAC comprising the molecularly tagged ribosomal protein sequence under the
regulation of this characterizing gene sequence is then recovered and introduced into the
genome of a potential founder organism for a line of transformed organisms.

Preferably, the tagged ribosomal protein encoding sequence is inserted into the
characterizing gene sequences using 5' direct fusion without the use of an IRES, i.e., such
that the tagged ribosomal protein encoding sequence(s) is fused directly in frame to the
nucleotide sequence encoding at least the first codon of the characterizing gene coding
sequence and even the first two, four, five, six, eight, ten or twelve codons. In other
embodiments, the tagged ribosomal protein encoding sequence is inserted into the 3' UTR
of the characterizing gene and has its own IRES. In yet another specific embodiment, the
tagged ribosomal protein encoding sequence is inserted into the 5' UTR of the
characterizing gene with an IRES controlling the expression of the tagged ribosomal protein
encoding sequence.

In a preferred aspect of the invention, the molecularly tagged ribosomal protein
encoding sequence is introduced into a BAC containing characterizing gene regulatory
sequences by the methods of Heintz et al., WO 98/59060 and Heintz et al., WO 01/05962,
both of which are incorporated herein by reference in their entireties. The molecularly
tagged sequence is introduced by performing selective homologous recombination on a
particular nucleotide sequence contained in a recombination deficient host cell, i.e., a cell
that cannot independently support homologous recombination, e.g., RecA-. The method
preferably employs a recombination cassette that contains a nucleic acid containing the
molecular-tag coding sequence that selectively integrates into a specific site in the
characterizing gene by virtue of sequences homologous to the characterizing gene flanking
the molecular-tag gene coding sequences on the shuttle vector when the recombination
deficient host cell is induced to support homologous recombination (for example by
providing a functional RecA gene on the shuttle vector used to introduce the recombination
cassette).

In a preferred aspect, the particular nucleotide sequence that has been selected to
undergo homologous recombination is contained in an independent origin based cloning
vector introduced into or contained within the host cell, and neither the independent origin
based cloning vector alone, nor the independent origin based cloning vector in combination
with the host cell, can independently support homologous recombination (e.g., is RecA-).
Preferably, the independent origin based cloning vector is a BAC or a bacteriophage-derived
artificial chromosome (BBPAC) and the host cell is a host bacterium, preferably E. coli.

In another preferred aspect, sufficient characterizing gene sequences flank the tagged
ribosomal protein encoding sequence to accomplish homologous recombination and target
the insertion of the molecularly tagged ribosomal protein coding sequences to a particular
location in the characterizing gene. The tagged ribosomal protein coding sequence and the
homologous characterizing gene sequences are preferably present on a shuttle vector
containing appropriate selectable markers and the RecA gene, optionally with a temperature
sensitive origin of replication (see Heintz et al., WO 98/59060 and Heintz et al., WO
01/05962) such that the shuttle vector only replicates at the permissive temperature and can
be diluted out of the host cell population at the non-permissive temperature. When the
shuttle vector is introduced into the host cell containing the BAC, the RecA gene is
expressed and recombination of the homologous shuttle vector and BAC sequences can
occur, thus targeting the tagged ribosomal protein encoding sequence (along with the shuttle
vector sequences and flanking characterizing gene sequences) to the characterizing gene
sequences in the BAC.

The BACs can be selected and screened for integration of the molecularly tagged
ribosomal protein coding sequences into the selected site in the characterizing gene
sequences using methods well known in the art (e.g., methods described in Section 5, infra,
and in Heintz et al., WO 98/59060 entitled "Methods of preforming (sic) homologous
recombination based modification of nucleic acids in recombination deficient cells and use
of the modified nucleic acid products thereof," and Heintz et al., WO 01/05962, entitled
"Conditional homologous recombination of large genomic vector inserts"). Optionally, the
shuttle vector sequences not containing the molecularly tagged ribosomal protein coding
sequences (including the RecA gene and any selectable markers) can be removed from the
BAC by resolution as described in Section 5 and in Heintz et al., WO 98/59060 and Heintz
et al., WO 01/05962.

If the shuttle vector contains a negative selectable marker, cells can be selected for
loss of the shuttle vector sequences. In an alternative embodiment, the functional RecA
gene is provided on a second vector and removed after recombination, e.g., by dilution of
the vector or by any method known in the art. The exact method used to introduce the
tagged ribosomal protein encoding sequence and to remove (or not) the RecA (or other
appropriate recombination enzyme) will depend upon the nature of the BAC library used
(for example, the selectable markers present on the BAC vectors) and such modifications
are within the skill in the art.

Once the BAC containing the characterizing gene regulatory sequences and 
molecularly tagged ribosomal protein coding sequences in the desired configuration is 
identified, it can be isolated from the host E. coli cells using routine methods and used to 
make transformed organisms as described infra.

BACs to be used in the methods of the invention are selected and/or screened using
the methods described supra.

Alternatively, the BAC can also be engineered or modified by "E-T cloning," as
described by Muyrers et al. (1999, Nucleic Acids Res. 27(6): 1555-57, incorporated herein
by reference in its entirety). Using these methods, specific DNA may be engineered into a
BAC independently of the presence of suitable restriction sites. This method is based on
homologous recombination mediated by the recE and recT proteins ("ET-cloning") (Zhang
et al., 1998, Nat. Genet. 20(2): 123-28; incorporated herein by reference in its entirety).
Homologous recombination can be performed between a PCR fragment flanked by short
homology arms and an endogenous intact recipient such as a BAC. Using this method,
homologous recombination is not limited by the disposition of restriction endonuclease
cleavage sites or the size of the target DNA. A BAC can be modified in its host strain using
a plasmid, e.g., pBAD-αβγ, in which recE and recT have been replaced by their respective
functional counterparts of phage lambda (Muyrers et al., 1999, Nucleic Acids Res. 27(6):
1555-57). Preferably, a BAC is modified by recombination with a PCR product containing
homology arms ranging from 27-60 bp. In a specific embodiment, homology arms are 50
bp in length.
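
The homology-arm design described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure of Muyrers et al. or Zhang et al.: the function name `design_arm_primers`, the 20-nt priming region, and the example sequences are hypothetical. Each primer carries one homology arm copied from the target (e.g., BAC) sequence on either side of the intended insertion point, followed by a short stretch that anneals to the cassette to be amplified.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_arm_primers(target: str, insert_pos: int, cassette: str,
                       arm_len: int = 50, priming_len: int = 20):
    """Build forward/reverse PCR primers carrying homology arms.

    target      -- sequence of the recipient (hypothetical example input)
    insert_pos  -- 0-based index in target where the cassette is to be inserted
    cassette    -- sequence to be amplified and inserted
    arm_len     -- homology arm length; restricted here to the 27-60 bp range
                   given in the text (50 bp in the specific embodiment)
    priming_len -- length of the cassette-annealing part (an assumption)
    """
    if not 27 <= arm_len <= 60:
        raise ValueError("homology arm length must be 27-60 bp")
    if insert_pos < arm_len or insert_pos + arm_len > len(target):
        raise ValueError("not enough flanking target sequence for the arms")
    if len(cassette) < priming_len:
        raise ValueError("cassette is shorter than the priming region")

    left_arm = target[insert_pos - arm_len:insert_pos]
    right_arm = target[insert_pos:insert_pos + arm_len]

    # Forward primer: left arm, then the 5' end of the cassette.
    forward = left_arm + cassette[:priming_len]
    # Reverse primer: reverse complement of (3' end of cassette + right arm).
    reverse = reverse_complement(cassette[-priming_len:] + right_arm)
    return forward, reverse
```

The PCR product amplified with such primers would consist of the cassette flanked by the two arms, which is the substrate the text describes for recombination with the intact recipient.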

In another embodiment, a nucleic acid of the invention is inserted into a yeast
artificial chromosome (YAC) (Burke et al., 1987, Science 236: 806-12; and Peterson et al.,
1997, Trends Genet. 13: 61).

In other embodiments, the nucleic acid of the invention is inserted into another
vector developed for the cloning of large segments of mammalian DNA, such as a cosmid
or bacteriophage P1 (Sternberg et al., 1990, Proc. Natl. Acad. Sci. USA 87: 103-07). The
approximate maximum insert size is 30-35 kb for cosmids and 100 kb for bacteriophage P1.

In another embodiment, the nucleic acid of the invention is inserted into a P1-derived
artificial chromosome (PAC) (Mejia et al., 1997, Retrofitting vectors for
Escherichia coli-based artificial chromosomes (PACs and BACs) with markers for
transfection studies, Genome Res. 7(2): 179-86). The maximum insert size is 300 kb.
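
As a rough illustration of how the insert-size limits quoted above constrain vector choice, the following Python sketch lists the vectors that could hold an insert of a given size. Only the three limits stated in this section are encoded; capacities for BACs and YACs are not given here and are deliberately left out. The function name `vectors_for_insert` is hypothetical, and the upper end of the 30-35 kb cosmid range is used as an assumption.

```python
# Approximate maximum insert sizes in kb, taken from the text above.
# For cosmids, the upper end of the quoted 30-35 kb range is assumed.
MAX_INSERT_KB = {
    "cosmid": 35,
    "bacteriophage P1": 100,
    "PAC": 300,
}


def vectors_for_insert(insert_kb: float) -> list[str]:
    """Return the listed vectors whose stated maximum insert size is >= insert_kb."""
    if insert_kb <= 0:
        raise ValueError("insert size must be positive")
    return [name for name, limit in MAX_INSERT_KB.items() if insert_kb <= limit]
```

For example, a genomic fragment of about 90 kb would exceed the cosmid limit but fit within the limits stated for bacteriophage P1 and PAC vectors.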

5.8. METHODS OF SCREENING FOR EXPRESSION OF NUCLEIC ACIDS OF THE INVENTION

Potential founder organisms for a line of transformed organisms can be screened for 
expression of the tagged ribosomal protein gene sequence in the population of cells 
characterized by expression of the endogenous characterizing gene. 

Transformed organisms that exhibit appropriate expression (e.g., detectable
expression having substantially the same expression pattern as the endogenous 
characterizing gene in a corresponding non-transgenic organism or anatomical region 
thereof, i.e., detectable expression in at least 80%, 90% or, preferably, 95% of the cells 
shown to express the endogenous gene by in situ hybridization) are selected as lines of 
transformed organisms. 
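
The selection criterion above amounts to a simple overlap calculation. The following Python sketch is one hypothetical way to express it, assuming each cell in a scored region has an identifier and has been scored separately for the endogenous characterizing gene (by in situ hybridization) and for the tagged protein; the function names and the set-based representation are illustrative assumptions.

```python
def expression_overlap(endogenous_cells: set, tagged_cells: set) -> float:
    """Fraction of endogenous-gene-positive cells that also express the tag."""
    if not endogenous_cells:
        raise ValueError("no cells express the endogenous characterizing gene")
    return len(endogenous_cells & tagged_cells) / len(endogenous_cells)


def meets_criterion(endogenous_cells: set, tagged_cells: set,
                    threshold: float = 0.80) -> bool:
    """True if the overlap reaches the chosen threshold (0.80, 0.90 or 0.95)."""
    return expression_overlap(endogenous_cells, tagged_cells) >= threshold
```

Note that cells positive only for the tagged protein do not lower the score; the criterion as stated is defined relative to the cells shown to express the endogenous gene.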

In a preferred embodiment, immunohistochemistry using an antibody specific for the
epitope tag is used to detect expression of the tagged ribosomal fusion protein product. 

5.9. EXPRESSION OF A TAGGED RIBOSOMAL PROTEIN IN A POPULATION OF CELLS

The nucleic acid of the invention containing the nucleotide sequence encoding the 
tagged ribosomal protein can be expressed in the cell type of interest using methods well 
known in the art for recombinant gene expression. The choice of which method to use to 
express a DNA sequence encoding a tagged ribosomal protein in a chosen population of
cells depends upon the population. 

In certain embodiments, the chosen population of cells is a particular population of
cells in culture that have been transfected with the construct encoding the tagged ribosomal
protein. The expression construct is chosen to allow efficient and high-level expression in
the type of cells present in culture, with the mRNA of the transfected population being
isolated according to the methods described herein.
This mode of the invention would be particularly useful if one wanted to study
global gene expression changes in cultured cells in response to the expression of a particular
gene product, co-expressed with a tagged ribosomal subunit to allow isolation of mRNA
from co-expressing cells.

In another embodiment, the expression construct can be contained within a viral
vector or virus, which is introduced into the desired host cell as described above. This
embodiment permits study of mRNA populations from transduced or infected cells, in vitro
or in vivo.

In another embodiment, expression of the tagged ribosomal protein is driven in
populations of cells by the characterizing gene regulatory element.

In another embodiment, the gene sequences encoding the characterizing gene
regulatory element and the tagged ribosomal protein are introduced by homologous
recombination.

In another embodiment, homologous recombination is used to introduce only the
epitope tag gene coding sequences.

Methods for selecting for cells containing and expressing the nucleotide sequences
encoding the fusion proteins of the invention are well known in the art. For example, in
eukaryotic cells, the nucleotide sequence encoding the fusion protein is associated with (for
example, present on the same vector as) a selectable marker such as dhfr. Cells having the
dhfr selectable marker are resistant to the drug methotrexate. Increasing levels of
methotrexate can also lead to amplification of the selectable marker (and, concomitantly,
the sequence encoding the fusion protein of the invention). Once the selectable marker
sequences have integrated into the host cell chromosome, the selectable marker sequences
(and the sequences encoding the fusion protein of the invention) will be maintained by the
host cells even in the absence of selection (e.g., in the absence of methotrexate when the
selectable marker is dhfr).

5.10. NUCLEIC ACID CONSTRUCTS 

The invention provides vectors and lines of organisms that contain a nucleic acid
construct, e.g., a transgene, that comprises the coding sequence for a peptide tag-ribosomal
fusion protein or peptide tag-mRNA binding protein fusion protein under the control of
regulatory sequences for a "characterizing gene." The regulatory sequence is, e.g., an
endogenous promoter of a characterizing gene. This characterizing gene is endogenous to a
host cell or host organism (or is an ortholog of an endogenous gene) and is expressed in a
particular select population of cells of the organism. Expression of the nucleic acid
construct is such that the nucleic acid construct has substantially the same expression
pattern as the endogenous characterizing gene.

A transgene is a nucleotide sequence that has been or is designed to be incorporated
into a cell, particularly a mammalian cell, that in turn becomes or is incorporated into a
living animal such that the nucleic acid containing the nucleotide sequence is expressed
(i.e., the mammalian cell is transformed with the transgene).

The characterizing gene sequence is preferably endogenous to the transformed
organism, or is an ortholog of an endogenous gene, e.g., the human ortholog of a gene
endogenous to the animal to be made transgenic. A nucleic acid construct comprising the
tagged ribosomal protein and, optionally, the characterizing gene sequence may be present as
an extrachromosomal element in some or all of the cells of a transformed organism such as
a transgenic animal or, preferably, stably integrated into some or all of the cells, more
preferably into the germ line DNA of the animal (i.e., such that the nucleic acid construct is
transmitted to all or some of the animal's progeny), thereby directing expression of an
encoded gene product (i.e., the tagged ribosomal protein gene product) in one or more cell
types or tissues of the transformed organism. Unless otherwise indicated, it will be assumed
that a transformed organism, e.g., a transgenic animal, comprises stable changes to the
chromosomes of germline cells. In a preferred embodiment, the nucleic acid construct is
present in the genome at a site other than where the endogenous characterizing gene is
located. In other embodiments, the nucleic acid construct is incorporated into the genome
of the organism at the site of the endogenous characterizing gene, for example, by
homologous recombination.

In certain embodiments, transformed organisms are created by introducing a nucleic
acid construct of the invention into their genomes using methods routine in the art, for
example, the methods described in Section 5.7, supra. A construct is a recombinant nucleic
acid, generally recombinant DNA, generated for the purpose of the expression of a specific
nucleotide sequence(s), or to be used in the construction of other recombinant nucleotide
sequences.

A transgenic construct of the invention includes at least the coding region for a
peptide tag fused to the coding region for a ribosomal protein, operably linked to all or a
portion of the regulatory sequences, e.g., a promoter and/or enhancer, of the characterizing
gene. The transgenic construct optionally includes enhancer sequences and coding and
other non-coding sequences (including intron and 5' and 3' untranslated sequences) from the
characterizing gene such that the tagged ribosomal fusion protein gene is expressed in the
same subset of cells as the characterizing gene. The tagged ribosomal fusion protein gene
coding sequences and the characterizing gene regulatory sequences are operably linked,
meaning that they are connected in such a way as to permit expression of the tagged
ribosomal protein gene when the appropriate molecules (e.g., transcriptional activator
proteins) are bound to the characterizing gene regulatory sequences. Preferably the linkage
is covalent, most preferably by a nucleotide bond. The promoter region is of sufficient
length to promote transcription, as described in Alberts et al. (1989) in Molecular Biology
of the Cell, 2d Ed. (Garland Publishing, Inc.).

In one aspect of the invention, the regulatory sequence is the promoter of a
characterizing gene. Other promoters that direct tissue-specific expression of the coding
sequences to which they are operably linked are also contemplated in the invention. In
specific embodiments, a promoter from one gene and other regulatory sequences (such as
enhancers) from other genes are combined to achieve a particular temporal and spatial
expression pattern of the tagged ribosomal protein gene.

Methods that are well known to those skilled in the art can be used to construct
vectors containing tagged ribosomal protein gene coding sequences operatively associated
with the appropriate transcriptional and translational control signals of the characterizing
gene. These methods include, for example, in vitro recombinant DNA techniques and in
vivo genetic recombination. See, for example, the techniques described in Sambrook et al.,
2001, Molecular Cloning, A Laboratory Manual, Third Edition, Cold Spring Harbor
Laboratory Press, N.Y.; and Ausubel et al., 1989, Current Protocols in Molecular Biology,
Green Publishing Associates and Wiley Interscience, N.Y., both of which are hereby
incorporated by reference in their entireties.

The tagged ribosomal protein gene coding sequences may be incorporated into some
or all of the characterizing gene sequences such that the tagged ribosomal protein gene is
expressed in substantially the same expression pattern as the endogenous characterizing gene in
the transformed organism, or at least in an anatomical region or tissue of the organism (by
way of example, in the brain, spinal cord, heart, skin, bones, head, limbs, blood, muscle,
peripheral nervous system, etc. of an animal) containing the population of cells to be
marked by expression of the tagged ribosomal protein gene coding sequences. By
"substantially the same expression pattern" is meant that the tagged ribosomal protein gene
coding sequences are expressed in at least 80%, 85%, 90%, 95%, and preferably 100% of
the cells shown to express the endogenous characterizing gene by in situ hybridization.
Because detection of the tagged ribosomal protein gene expression product may be more
sensitive than in situ hybridization detection of the endogenous characterizing gene
messenger RNA, more cells may be detected to express the tagged ribosomal protein gene
product in the transformed organism than are detected to express the endogenous
characterizing gene by in situ hybridization or any other method known in the art for in situ
detection of gene expression.

For example, the nucleotide sequences encoding the tagged ribosomal protein gene
protein product may replace the characterizing gene coding sequences in a genomic clone of
the characterizing gene, leaving the characterizing gene regulatory non-coding sequences.
In other embodiments, the tagged ribosomal protein gene coding sequences (either genomic
or cDNA sequences) replace all or a portion of the characterizing gene coding sequence and
the nucleotide sequence only contains the upstream and downstream characterizing gene
regulatory sequences.

In a preferred embodiment, the tagged ribosomal protein gene coding sequences are
inserted into or replace transcribed coding or non-coding sequences of the genomic
characterizing gene sequences, for example, into or replacing a region of an exon or of the 3'
UTR of the characterizing gene genomic sequence. Preferably, the tagged ribosomal
protein gene coding sequences are not inserted into and do not replace regulatory sequences of the
genomic characterizing gene sequences. Preferably, the tagged ribosomal protein gene
coding sequences are also not inserted into and do not replace characterizing gene intron sequences.

In a preferred embodiment, the tagged ribosomal protein gene coding sequence is
inserted into or replaces a portion of the 3' untranslated region (UTR) of the characterizing
gene genomic sequence. In another preferred embodiment, the coding sequence of the
characterizing gene is mutated or disrupted to abolish characterizing gene expression from
the nucleic acid construct without affecting the expression of the tagged ribosomal protein
gene. In certain embodiments, the tagged ribosomal protein gene coding sequence has its
own internal ribosome entry site (IRES). For descriptions of IRESes, see, e.g., Jackson et
al., 1990, Trends Biochem Sci. 15(12):477-83; Jang et al., 1988, J. Virol. 62(8):2636-43;
Jang et al., 1990, Enzyme 44(1-4):292-309; and Martinez-Salas, 1999, Curr. Opin.
Biotechnol. 10(5):458-64.

In another embodiment, the tagged ribosomal protein gene is inserted at the 3' end
of the characterizing gene coding sequence. In a specific embodiment, the tagged ribosomal
protein coding sequences are introduced at the 3' end of the characterizing gene coding
sequence such that the nucleotide sequence encodes a fusion of the characterizing gene and
the tagged ribosomal protein gene sequences.

Preferably, the tagged ribosomal protein gene coding sequences are inserted using 5'
direct fusion wherein the tagged ribosomal protein gene coding sequences are inserted
in-frame adjacent to the initial ATG sequence (or adjacent to the nucleotide sequence encoding
the first two, three, four, five, six, seven or eight amino acids of the characterizing gene
protein product) of the characterizing gene, so that translation of the inserted sequence
produces a fusion protein of the first methionine (or first few amino acids) derived from the
characterizing gene sequence fused to the tagged ribosomal protein gene protein. In this
embodiment, the characterizing gene coding sequence 3' of the tagged ribosomal protein
gene coding sequences is not expressed. In yet another specific embodiment, a tagged
ribosomal protein gene is inserted into a separate cistron in the 5' region of the
characterizing gene genomic sequence and has an independent IRES sequence.
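
The 5' direct fusion described above depends on keeping the inserted coding sequence in the reading frame set by the initial ATG of the characterizing gene. The Python sketch below illustrates that frame check on sequence strings only; it is a hypothetical helper, not part of the disclosed methods. The function name `build_direct_fusion`, the choice to reject any stop codon before the final codon, and the example sequences are all assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(seq: str) -> list:
    """Split a sequence into complete codons, ignoring any trailing 1-2 nt."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def build_direct_fusion(characterizing_cds: str, tagged_cds: str,
                        n_codons: int = 1) -> str:
    """Fuse the first n_codons of the characterizing gene to the tagged CDS.

    characterizing_cds -- coding sequence of the characterizing gene (starts at ATG)
    tagged_cds         -- coding sequence for the tagged ribosomal protein,
                          supplied here without its own start codon (an assumption)
    n_codons           -- number of leading characterizing-gene codons to keep
    """
    characterizing_cds = characterizing_cds.upper()
    tagged_cds = tagged_cds.upper()
    if not characterizing_cds.startswith("ATG"):
        raise ValueError("characterizing gene CDS must begin with ATG")
    if n_codons < 1 or len(characterizing_cds) < 3 * n_codons:
        raise ValueError("characterizing gene CDS is too short for n_codons")
    if len(tagged_cds) % 3 != 0:
        raise ValueError("tagged CDS is not a whole number of codons")

    fusion = characterizing_cds[:3 * n_codons] + tagged_cds
    # A stop codon anywhere before the last codon would truncate the fusion.
    if any(codon in STOP_CODONS for codon in split_codons(fusion)[:-1]):
        raise ValueError("premature stop codon in the fusion reading frame")
    return fusion
```

With `n_codons=1` the result begins with the characterizing gene's ATG followed directly by the tagged coding sequence, which corresponds to the first-methionine fusion described in the text.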

In certain embodiments, an IRES is operably linked to the tagged ribosomal protein
gene coding sequence to direct translation of the tagged ribosomal protein gene. The IRES
permits the creation of polycistronic mRNAs from which several proteins can be
synthesized under the control of an endogenous transcriptional regulatory sequence. Such a
construct is advantageous because it allows marker proteins to be produced in the same cells
that express the endogenous gene (Heintz, 2000, Hum. Mol. Genet. 9(6): 937-43; Heintz et
al., WO 98/59060; Heintz et al., WO 01/05962; which are incorporated herein by reference
in their entireties).

Shuttle vectors containing an IRES, such as the pLD53 shuttle vector (see Heintz et
al., WO 01/05962), may be used to insert the tagged ribosomal protein gene sequence into
the characterizing gene. The IRES in the pLD53 shuttle vector is derived from EMCV
(encephalomyocarditis virus) (Jackson et al., 1990, Trends Biochem Sci. 15(12):477-83;
and Jang et al., 1988, J. Virol. 62(8):2636-43, both of which are hereby incorporated by
reference). The common sequence between the first and second IRES sites in the shuttle
vector is shown below. This common sequence also matches pIRES (Clontech) from
1158-1710.

TAACGTTACTGGCCGAAGCCGCTTGGAATAAGGCCGGTGTGCGTTTGTCTATAT
GTTATTTTCCACCATATTGCCGTCTTTTGGCAATGTGAGGGCCCGGAAACCTGG 
CCCTGTCTTCTTGACGAGCATTCCTAGGGGTCTTTCCCCTCTCGCCAAAGGAATG 
CAAGGTCTGTTGAATGTCGTGAAGGAAGCAGTTCCTCTGGAAGCTTCTTGAAGA 
CAAACAACGTCTGTAGCGACCCTTTGCAGGCAGCGGAACCCCCCACCTGGCGA
CAGGTGCCTCTGCGGCCAAAAGCCACGTGTATAAGATACACCTGCAAAGGCGG
CACAACCCCAGTGCCACGTTGTGAGTTGGATAGTTGTGGAAAGAGTCAAATGG 
CTCTCCTAAGCGTATTCAACAAGGGGCTGAAGGATGCCCAGAAGGTACTCCATT 
GTATGGGATCTGATCTGGGGCCTCGGTGCACATGCTTTACATGTGTTTAGTCGA 
GGTTAAAAAAACGTCTAGGCCCCCCGAACCACGGGGACGTGGTTTTCCTTTGAA
AAACACCATGATA (SEQ ID NO:6)

In a specific embodiment, the EMCV IRES is used to direct independent translation
of the tagged ribosomal protein gene coding sequences (Gorski and Jones, 1999, Nucleic
Acids Research 27(9):2059-61).

In another embodiment, more than one IRES site is present in a nucleic acid of the
invention to direct translation of more than one coding sequence. However, in this case,
each IRES sequence must be a different sequence.

In certain embodiments where a tagged ribosomal protein gene is expressed
conditionally, the tagged ribosomal protein gene coding sequence is embedded in the
genomic sequence of the characterizing gene and is inactive unless acted on by a
transactivator or recombinase, whereby expression of the tagged ribosomal protein gene can
then be driven by the characterizing gene regulatory sequences.

In other embodiments, the tagged ribosomal protein gene is expressed conditionally,
through the activity of a gene that is an activator or suppressor of gene expression. In this
case, the gene encodes a transactivator, e.g., tetR, or a recombinase, e.g., FLP, whose
expression is regulated by the characterizing gene regulatory sequences. The tagged
ribosomal protein gene is linked to a conditional element, e.g., the tet promoter, or is
flanked by recombinase sites, e.g., FRT sites, and may be located anywhere within the
genome. In such a system, expression of the transactivator gene, as regulated by the
characterizing gene regulatory sequences, activates the expression of the tagged ribosomal
protein gene.

In certain embodiments, exogenous translational control signals, including, for
example, the ATG initiation codon, can be provided by the characterizing gene or some
other heterologous gene. The initiation codon must be in phase with the reading frame of
the desired coding sequence of the tagged ribosomal protein gene to ensure translation of
the entire insert. These exogenous translational control signals and initiation codons can be
of a variety of origins, both natural and synthetic. The efficiency of expression may be
enhanced by the inclusion of appropriate transcription enhancer elements, transcription
terminators, etc. (see Bittner et al., 1987, Methods in Enzymol. 153: 516-44).

The construct can also comprise one or more selectable markers that enable
identification and/or selection of recombinant vectors. The selectable marker may be the
tagged ribosomal protein gene product itself or an additional selectable marker not
necessarily tied to the expression of the characterizing gene.

In a specific embodiment, a nucleic acid of the invention is expressed conditionally,
using any type of inducible or repressible system available for conditional expression of
genes known in the art, e.g., a system inducible or repressible by tetracycline ("tet system");
interferon; estrogen, ecdysone, or other steroid inducible system; Lac operator, progesterone
antagonist RU486, or rapamycin (FK506). For example, a conditionally expressible nucleic
acid of the invention can be created in which the coding region for the tagged ribosomal
protein gene (and, optionally also the characterizing gene) is operably linked to a genetic
switch, such that expression of the tagged ribosomal protein gene can be further regulated.
One example of this type of switch is a tetracycline-based switch (see infra). In a specific
embodiment, the tagged ribosomal protein gene product is the conditional enhancer or
suppressor which, upon expression, enhances or suppresses expression of a selectable or
detectable marker present either in the nucleic acid of the invention or elsewhere in the
genome of the transformed organism.

A conditionally expressible nucleic acid of the invention can be site-specifically 
inserted into an untranslated region (UTR) of genomic DNA of the characterizing gene, e.g., 
the 3' UTR or the 5' region, so that expression of the nucleic acid via the conditional 
expression system is induced or abolished by administration of the inducing or repressing
substance, e.g., administration of tetracycline or doxycycline, ecdysone, estrogen, etc.,
without interfering with the normal profile of gene expression (see, e.g., Bond et al., 2000,
Science 289: 1942-46; incorporated herein by reference in its entirety). In the case of a 
binary system, the detectable or selectable marker operably linked to the conditional 
expression elements is present in the nucleic acid of the invention, but outside the
characterizing gene coding sequences and not operably linked to characterizing gene
regulatory sequences or, alternatively, on another site in the genome of the transformed 
organism. 

Preferably, the nucleic acid of the invention comprises all or a significant portion of 
the genomic characterizing gene, preferably, at least all or a significant portion of the 5'
regulatory sequences of the characterizing gene, most preferably, sufficient sequence 5' of
the characterizing gene coding sequence to direct expression of the tagged ribosomal protein 
gene coding sequences in the same expression pattern (temporal and/or spatial) as the 
endogenous counterpart of the characterizing gene. In certain embodiments, the nucleic 
acid of the invention comprises one exon, two exons, all but one exon, or all but two exons,
of the characterizing gene.

Nucleic acids comprising the characterizing gene sequences and tagged ribosomal 
protein gene coding sequences can be obtained from any available source. In most cases, all 
or a portion of the characterizing gene sequences and/or the tagged ribosomal protein gene 
coding sequences are known, for example, in publicly available databases such as GenBank,
UniGene and the Mouse Genome Informatics (MGI) Database, to name just a few, or in
private subscription databases. With a portion of the sequence in hand, hybridization
probes (for filter hybridization or PCR amplification) can be designed using highly routine
methods in the art to identify clones containing the appropriate sequences (preferred
methods for identifying appropriate BACs are discussed in Section 5.7.1, supra), for
example in a library or other source of nucleic acid. If the sequence of the gene of interest
from one species is known and the counterpart gene from another species is desired, it is 
routine in the art to design probes based upon the known sequence. The probes hybridize to 
nucleic acids from the species from which the sequence is desired, for example, 
hybridization to nucleic acids from genomic or DNA libraries from the species of interest. 

By way of example and not limitation, genomic clones can be identified by probing
a genomic DNA library under appropriate hybridization conditions, e.g., high stringency
conditions, low stringency conditions or moderate stringency conditions, depending on the 
relatedness of the probe to the genomic DNA being probed. For example, if the probe and 
the genomic DNA are from the same species, then high stringency hybridization conditions
may be used; however, if the probe and the genomic DNA are from different species, then
low stringency hybridization conditions may be used. High, low and moderate stringency 
conditions are all well known in the art. 

Procedures for low stringency hybridization are as follows (see also Shilo and 
Weinberg, 1981, Proc. Natl. Acad. Sci. USA 78:6789-6792): Filters containing DNA are
pretreated for 6 hours at 40°C in a solution containing 35% formamide, 5X SSC, 50 mM
Tris-HCl (pH 7.5), 5 mM EDTA, 0.1% PVP, 0.1% Ficoll, 1% BSA, and 500 µg/ml
denatured salmon sperm DNA. Hybridizations are carried out in the same solution with the
following modifications: 0.02% PVP, 0.02% Ficoll, 0.2% BSA, 100 µg/ml salmon sperm
DNA, 10% (wt/vol) dextran sulfate, and 5-20 X 10^6 cpm 32P-labeled probe is used. Filters
are incubated in hybridization mixture for 18-20 hours at 40°C, and then washed for
1.5 hours at 55°C in a solution containing 2X SSC, 25 mM Tris-HCl (pH 7.4), 5 mM 
EDTA, and 0.1% SDS. The wash solution is replaced with fresh solution and incubated an 
additional 1.5 hours at 60°C. Filters are blotted dry and exposed for autoradiography. If 
necessary, filters are washed for a third time at 65-68°C and reexposed to film.

Procedures for high stringency hybridizations are as follows: Prehybridization of
filters containing DNA is carried out for 8 hours to overnight at 65°C in buffer composed of
6X SSC, 50 mM Tris-HCl (pH 7.5), 1 mM EDTA, 0.02% PVP, 0.02% Ficoll, 0.02% BSA,
and 500 µg/ml denatured salmon sperm DNA. Filters are hybridized for 48 hours at 65°C
in prehybridization mixture containing 100 µg/ml denatured salmon sperm DNA and 5-20
X 10^6 cpm of 32P-labeled probe. Washing of filters is done at 37°C for 1 hour in a solution
containing 2X SSC, 0.01% PVP, 0.01% Ficoll, and 0.01% BSA. This is followed by a
wash in 0.1 X SSC at 50°C for 45 minutes before autoradiography. 

Moderate stringency conditions for hybridization are as follows: Filters containing 
DNA are pretreated for 6 hours at 55°C in a solution containing 6X SSC, 5X Denhardt's
solution, 0.5% SDS, and 100 µg/ml denatured salmon sperm DNA. Hybridizations are
carried out in the same solution and 5-20 X 10^6 cpm 32P-labeled probe is used. Filters are
incubated in the hybridization mixture for 18-20 hours at 55°C, and then washed twice for 
30 minutes at 60°C in a solution containing 1 X SSC and 0.1% SDS. 
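By way of illustration and not limitation, the temperatures, durations and wash buffers of the three hybridization procedures above can be collected into a small lookup table. The following Python sketch is an editorial aid, not part of the described methods: the names `Wash`, `Protocol`, `PROTOCOLS` and `choose_stringency` are hypothetical, and the selection rule encodes only the same-species/different-species guidance given above (moderate stringency would be chosen by the practitioner according to the relatedness of probe and target).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Wash:
    temp_c: int      # wash temperature in degrees Celsius
    minutes: int     # wash duration in minutes
    buffer: str      # wash buffer composition as stated in the text


@dataclass(frozen=True)
class Protocol:
    pretreat_temp_c: int
    hyb_temp_c: int
    hyb_hours: str           # kept as text because the source gives ranges
    washes: Tuple[Wash, ...]


LOW_WASH = "2X SSC, 25 mM Tris-HCl (pH 7.4), 5 mM EDTA, 0.1% SDS"

# Values transcribed from the low, high and moderate stringency procedures.
PROTOCOLS = {
    "low": Protocol(40, 40, "18-20",
                    (Wash(55, 90, LOW_WASH), Wash(60, 90, LOW_WASH))),
    "moderate": Protocol(55, 55, "18-20",
                         (Wash(60, 30, "1X SSC, 0.1% SDS"),
                          Wash(60, 30, "1X SSC, 0.1% SDS"))),
    "high": Protocol(65, 65, "48",
                     (Wash(37, 60, "2X SSC, 0.01% PVP, 0.01% Ficoll, 0.01% BSA"),
                      Wash(50, 45, "0.1X SSC"))),
}


def choose_stringency(probe_species: str, target_species: str) -> str:
    """Simplified rule (assumption): same species -> high, otherwise low."""
    return "high" if probe_species == target_species else "low"


if __name__ == "__main__":
    level = choose_stringency("mouse", "rat")
    print(level, PROTOCOLS[level])
```

The table makes the contrast explicit: stringency rises with hybridization temperature and with lower salt concentration in the final wash.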

With respect to the characterizing gene, all or a portion of the genomic sequence is
preferred, particularly, the sequences 5' of the coding sequence that contain the regulatory
sequences. A preferred method for identifying BACs containing appropriate and sufficient 
characterizing gene sequences to direct the expression of the tagged ribosomal protein gene 
coding sequences in substantially the same expression pattern as the endogenous 
characterizing gene is described in Section 5.7.1, supra. 

Briefly, the characterizing gene genomic sequences are preferably in a vector that
can accommodate significant lengths of sequence (for example, 10 kb's of sequence), such
as cosmids, YACs, and, preferably, BACs, and encompass at least 50, 70, 80, 100, 120, 150,
200, 250 or 300 kb of sequence that comprises all or a portion of the characterizing gene
sequence. The larger the vector insert, the more likely it is to identify a vector that contains
the characterizing gene sequences of interest. Vectors identified as containing
characterizing gene sequences can then be screened for those that are most likely to contain
sufficient regulatory sequences from the characterizing gene to direct expression of the 
tagged ribosomal protein gene coding sequences in substantially the same pattern as the 
endogenous characterizing gene. In general, it is preferred to have a vector containing the
entire genomic sequence for the characterizing gene. However, in certain cases, the entire
genomic sequence cannot be accommodated by a single vector or such a clone is not 
available. In these instances (or when it is not known whether the clone contains the entire 
genomic sequence), preferably the vector contains the characterizing gene sequence with the 
start, i.e., the most 5' end, of the coding sequence in the approximate middle of the vector
insert containing the genomic sequences and/or has at least 20 kb, 30 kb, 40 kb, 50 kb, 60
kb, 80 kb or 100 kb of genomic sequence on either side of the start of the characterizing 
gene coding sequence. This can be determined by any method known in the art, for 
example, but not by way of limitation, by sequencing, restriction mapping, PCR 
amplification assays, etc. In certain cases, the clones used may be from a library that has
been characterized (e.g., by sequencing and/or restriction mapping) and the clones identified
can be analyzed, for example, by restriction enzyme digestion and compared to database
information available for the library. In this way, the clone of interest can be identified and
used to query publicly available databases for existing contigs correlated with the
characterizing gene coding sequence start site. Such information can then be used to map
the characterizing gene coding sequence start site within the clone. Alternatively, the
tagged ribosomal protein gene sequences (or any other heterologous sequences) can be 
targeted to the 5' end of the characterizing gene coding sequence by directed homologous 
recombination (for example as described in Section 5.7) in such a way that a restriction site 
unique or at least rare in the characterizing gene clone sequence is introduced. The position
of the integrated tagged ribosomal protein gene coding sequences (and, thus, the 5' end of
the characterizing gene coding sequence) can be mapped by restriction endonuclease
digestion and mapping. The clone may also be mapped using internally generated
fingerprint data and/or by an alternative mapping protocol based upon the presence of
restriction sites and the T7 and SP6 promoters in the BAC vector, as described in Section
5.7.1, supra.
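By way of illustration and not limitation, once the start of the characterizing gene coding sequence has been mapped within a clone, the preference stated above for a minimum length of genomic sequence on either side of that start can be checked arithmetically. The sketch below assumes that insert boundaries and the coding sequence start are available as base-pair coordinates on the same strand; the function names `flanking_kb` and `has_sufficient_flank` and the example coordinates are hypothetical.

```python
def flanking_kb(insert_start: int, insert_end: int, cds_start: int):
    """Return (upstream_kb, downstream_kb) of insert sequence around the CDS start.

    All coordinates are in base pairs on the same reference (assumption).
    """
    if not insert_start <= cds_start <= insert_end:
        raise ValueError("coding sequence start lies outside the vector insert")
    upstream = (cds_start - insert_start) / 1000.0
    downstream = (insert_end - cds_start) / 1000.0
    return upstream, downstream


def has_sufficient_flank(insert_start: int, insert_end: int,
                         cds_start: int, min_kb: float = 20.0) -> bool:
    """True if at least min_kb of sequence lies on each side of the CDS start."""
    upstream, downstream = flanking_kb(insert_start, insert_end, cds_start)
    return upstream >= min_kb and downstream >= min_kb


if __name__ == "__main__":
    # Hypothetical 150 kb BAC insert with the CDS start near its middle.
    print(flanking_kb(0, 150_000, 75_000))
    print(has_sufficient_flank(0, 150_000, 75_000, min_kb=50))
```

The `min_kb` parameter can be set to any of the thresholds recited above (20 kb through 100 kb).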

In certain embodiments, the tagged ribosomal protein gene coding sequences are to
be inserted in a site in the characterizing gene sequences other than the 5' start site of the
characterizing gene coding sequences, for example, in the 3'-most translated or untranslated
regions. In these embodiments, the clones containing the characterizing gene should be
mapped to ensure the clone contains the site for insertion as well as sufficient sequence 5'
of the characterizing gene coding sequences to contain the regulatory sequences
necessary to direct expression of the tagged ribosomal protein gene sequences in the same
expression pattern as the endogenous characterizing gene.

Once such an appropriate vector containing the characterizing gene sequences is identified, the
tagged ribosomal protein gene can be incorporated into the characterizing gene sequence by
any method known in the art for manipulating DNA. In a preferred embodiment, 
homologous recombination in bacteria is used for target-directed insertion of the tagged 
ribosomal protein gene sequence into the genomic DNA encoding the characterizing gene 
and sufficient regulatory sequences to promote expression of the characterizing gene in its
endogenous expression pattern, which characterizing gene sequences have been inserted
into a BAC (see Section 5.7.1, supra). The BAC comprising the tagged ribosomal protein 
gene and characterizing gene sequences is then introduced into the genome of a potential 
founder organism for generating a line of transformed organisms, using methods well 
known in the art, e.g., those methods described in Section 5.7, supra. Such transformed
organisms are then screened for expression of the tagged ribosomal protein gene coding
sequences that mimics the expression of the endogenous characterizing gene. Several
different constructs containing nucleic acids of the invention may be introduced into several
potential founder organisms and the resulting transformed organisms are then screened for
the best (e.g., highest level) and most accurate (best mimicking expression of the
endogenous characterizing gene) expression of the tagged ribosomal protein gene coding
sequences.

The nucleic acid construct can be used to transform a host or recipient cell or 
organism using well known methods, e.g., those described in Section 5.6, supra. 
Transformation can be either a permanent or transient genetic change, preferably a
permanent genetic change, induced in a cell following incorporation of new DNA (i.e.,
DNA exogenous to the cell). Where the cell is a mammalian cell, a permanent genetic 
change is generally achieved by introduction of the DNA into the genome of the cell. In one 
aspect of the invention, a vector is used for stable integration of the nucleic acid construct 
into the genome of the cell. Vectors include plasmids, retroviruses and other animal
viruses, BACs, YACs, and the like.

5.11. EXPRESSION USING A BINARY SYSTEM

Since the level of expression of the tagged ribosomal protein within a cell may be 
important in the efficiency of the isolation procedure, in certain embodiments of the
invention, a binary system can be used, in which the endogenous promoter drives
expression of a protein that then activates a second expression construct. This second 
expression construct uses a strong promoter to drive expression of the tagged ribosomal 
protein at higher levels than is possible using the endogenous promoter itself. 

In certain embodiments, a particular population-specific gene drives expression of a
molecular switch (e.g., a recombinase, a transactivator) in a population-specific manner.
This switch then activates high-level expression through a second regulatory element
regulating expression of the tagged ribosomal protein. 

For example, the molecularly tagged ribosomal protein coding sequence may be 
expressed conditionally, through the activity of a molecular switch gene which is an
activator or suppressor of gene expression. In this case, the second gene encodes a
transactivator, e.g., tetR, or a recombinase, e.g., FLP, whose expression is regulated by the
characterizing gene regulatory sequences. The gene encoding the molecularly tagged
ribosomal protein is linked to a conditional element, e.g., the tet promoter, or is flanked by
recombinase sites, e.g., FRT sites, and may be located anywhere within the genome. In
such a system, expression of the molecular switch gene, as regulated by the characterizing
gene regulatory sequences, activates the expression of the molecular tag. 

5.12. CONDITIONAL TRANSCRIPTIONAL REGULATION SYSTEMS 

In certain embodiments, the tagged ribosomal protein gene can be expressed
conditionally by operably linking at least the coding region for the tagged ribosomal protein
gene to all or a portion of the regulatory sequences from the characterizing gene, and then 
operably linking the tagged ribosomal protein gene coding sequences and characterizing 
gene sequences to an inducible or repressible transcriptional regulation system. 

Transactivators in these inducible or repressible transcriptional regulation systems
are designed to interact specifically with sequences engineered into the vector. Such
systems include those regulated by tetracycline ("tet systems"), interferon, estrogen,
ecdysone, Lac operator, progesterone antagonist RU486, and rapamycin (FK506), with tet
systems being particularly preferred (see, e.g., Gingrich and Roder, 1998, Annu. Rev.
Neurosci. 21: 377-405; incorporated herein by reference in its entirety). These drugs or
hormones (or their analogs) act on modular transactivators composed of natural or mutant 
ligand binding domains and intrinsic or extrinsic DNA binding and transcriptional 
activation domains. In certain embodiments, expression of the detectable or selectable 
marker can be regulated by varying the concentration of the drug or hormone in medium in
vitro or in the diet of the transformed organism in vivo.

The inducible or repressible genetic system can restrict the expression of the 
detectable or selectable marker either temporally, spatially, or both temporally and spatially. 

In a preferred embodiment, the control elements of the tetracycline-resistance operon
of E. coli are used as an inducible or repressible transactivator or transcriptional regulation
system ("tet system") for conditional expression of the detectable or selectable marker. A
tetracycline-controlled transactivator can require either the presence or absence of the 
antibiotic tetracycline, or one of its derivatives, e.g., doxycycline (dox), for binding to the 
tet operator of the tet system, and thus for the activation of the tet system promoter (Ptet). 
Such an inducible or repressible tet system is preferably used in a mammalian cell. 

In a specific embodiment, a tetracycline-repressed regulatable system (TrRS) is used
(Agha-Mohammadi and Lotze, 2000, J. Clin. Invest. 105(9): 1177-83; incorporated herein
by reference in its entirety). This system exploits the specificity of the tet repressor (tetR) 
for the tet operator sequence (tetO), the sensitivity of tetR to tetracycline, and the activity of 
the potent herpes simplex virus transactivator (VP16) in eukaryotic cells. The TrRS uses a
conditionally active chimeric tetracycline-repressed transactivator (tTA) created by fusing
the COOH-terminal 127 amino acids of virion protein 16 (VP16) to the COOH terminus of
the tetR protein (which may be the tagged ribosomal protein gene). In the absence of
tetracycline, the tetR moiety of tTA binds with high affinity and specificity to a
tetracycline-regulated promoter (tRP), a regulatory region comprising seven repeats of tetO placed
upstream of a minimal human cytomegalovirus (CMV) promoter or β-actin promoter
(β-actin is preferable for neural expression). Once bound to the tRP, the VP16 moiety of tTA
transactivates the detectable or selectable marker gene by promoting assembly of a
transcriptional initiation complex. However, binding of tetracycline to tetR leads to a
conformational change in tetR accompanied by loss of tetR affinity for tetO, allowing
expression of the tagged ribosomal protein gene to be silenced by administering
tetracycline. Activity can be regulated over a range of orders of magnitude in response to
tetracycline. 

In another specific embodiment, a tetracycline-induced regulatable system is used to 
regulate expression of a detectable or selectable marker, e.g., the tetracycline transactivator 
(tTA) element of Gossen and Bujard (1992, Proc. Natl. Acad. Sci. USA 89: 5547-51;
incorporated herein by reference in its entirety). 

In another specific embodiment, the improved tTA system of Shockett et al. (1995,
Proc. Natl. Acad. Sci. USA 92: 6522-26, incorporated herein by reference in its entirety) is
used to drive expression of the marker. This improved tTA system places the tTA gene
under control of the inducible promoter to which tTA binds, making expression of tTA
itself inducible and autoregulatory. 

In another embodiment, a reverse tetracycline-controlled transactivator, e.g., rtTA2S-M2,
is used. The rtTA2S-M2 transactivator has reduced basal activity in the absence of
doxycycline, increased stability in eukaryotic cells, and increased doxycycline sensitivity
(Urlinger et al., 2000, Proc. Natl. Acad. Sci. USA 97(14): 7963-68; incorporated herein by
reference in its entirety). 
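By way of illustration and not limitation, the opposite ligand dependence of the tetracycline-repressed transactivator (tTA) and the reverse transactivator (rtTA) can be summarized as a simple truth table. The sketch below is a deliberately idealized on/off model that ignores basal activity and dose response; the function name `tet_promoter_active` is hypothetical and the ligand is taken to be tetracycline or a derivative such as doxycycline.

```python
def tet_promoter_active(transactivator: str, ligand_present: bool) -> bool:
    """Idealized model of tet-operator promoter activity.

    tTA  binds tetO and activates only in the ABSENCE of tetracycline/doxycycline.
    rtTA binds tetO and activates only in the PRESENCE of the ligand.
    """
    if transactivator == "tTA":
        return not ligand_present
    if transactivator == "rtTA":
        return ligand_present
    raise ValueError(f"unknown transactivator: {transactivator!r}")


if __name__ == "__main__":
    for name in ("tTA", "rtTA"):
        for ligand in (False, True):
            state = "on" if tet_promoter_active(name, ligand) else "off"
            print(f"{name:4s} ligand={ligand!s:5s} -> promoter {state}")
```

In the terms used above, administering tetracycline silences expression under tTA, whereas administering doxycycline induces expression under rtTA.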

In another embodiment, the tet-repressible system described by Wells et al. (1999,
Transgenic Res. 8(5): 371-81; incorporated herein by reference in its entirety) is used. In
one aspect of the embodiment, a single plasmid Tet-repressible system is used. Preferably,
a "mammalianized" TetR gene, rather than a wild-type TetR gene (tetR), is used (Wells et
al., 1999, Transgenic Res. 8(5): 371-81).

In other embodiments, expression of the tagged ribosomal protein gene is regulated
by using a recombinase system that turns tagged ribosomal protein gene expression on or
off by recombination in the appropriate region of the genome in which the marker
gene is inserted. In such a recombinase system, a gene that encodes a recombinase
can be used to turn on or off expression of the tagged ribosomal protein gene (for review of
temporal genetic switches and "tissue scissors" using recombinases, see Hennighausen and
Furth, 1999, Nature Biotechnol. 17: 1062-63). Exclusive recombination in a selected cell
type may be mediated by use of a site-specific recombinase such as Cre, FLP-wild type (wt),
FLP-L or FLPe. Recombination may be effected by any art-known method, e.g., the method
of Doetschman et al. (1987, Nature 330: 576-78; incorporated herein by reference in its
entirety); the method of Thomas et al. (1986, Cell 44: 419-28; incorporated herein by
reference in its entirety); the Cre-loxP recombination system (Sternberg and Hamilton,
1981, J. Mol. Biol. 150: 467-86; Lakso et al., 1992, Proc. Natl. Acad. Sci. USA 89:
6232-36; which are incorporated herein by reference in their entireties); the FLP recombinase
system of Saccharomyces cerevisiae (O'Gorman et al., 1991, Science 251: 1351-55); the
Cre-loxP-tetracycline control switch (Gossen and Bujard, 1992, Proc. Natl. Acad. Sci. USA 
89: 5547-51); and ligand-regulated recombinase system (Kellendonk et al., 1999, J. Mol.
Biol. 285: 175-82; incorporated herein by reference in its entirety). Preferably, the
recombinase is highly active, e.g., the Cre-loxP or the FLPe system, and has enhanced
thermostability (Rodriguez et al., 2000, Nature Genetics 25: 139-40; incorporated herein by
reference in its entirety). 

In certain embodiments, a recombinase system can be linked to a second inducible 
or repressible transcriptional regulation system. For example, a cell-specific Cre-loxP
mediated recombination system (Gossen and Bujard, 1992, Proc. Natl. Acad. Sci. USA 89:
5547-51) can be linked to a cell-specific tetracycline-dependent time switch detailed above
(Ewald et al., 1996, Science 273: 1384-1386; Furth et al., 1994, Proc. Natl. Acad. Sci. USA 91:
9302-06; St-Onge et al., 1996, Nucleic Acids Research 24(19): 3875-77; which are
incorporated herein by reference in their entireties). 

In one embodiment, an altered cre gene with enhanced expression in mammalian
cells is used (Gorski and Jones, 1999, Nucleic Acids Research 27(9): 2059-61; incorporated
herein by reference in its entirety). 

In a specific embodiment, the ligand-regulated recombinase system of Kellendonk et 
al. (1999, J. Mol. Biol. 285: 175-82; incorporated herein by reference in its entirety) can be
used. In this system, the ligand-binding domain (LBD) of a receptor, e.g., the progesterone
or estrogen receptor, is fused to the Cre recombinase to increase specificity of the 
recombinase.

5.13. METHODS OF SCREENING FOR EXPRESSION OF
MOLECULARLY TAGGED RIBOSOMAL PROTEIN CODING 
SEQUENCES 

In preferred embodiments, the invention provides a collection of lines of 
transformed organisms that contain a selected subset of cells or cell population expressing
molecularly-tagged ribosomes. The collection comprises at least two individual lines, 
preferably at least five individual lines. Each individual line is selected for the collection 
based on the identity of the subset of cells in which the molecularly tagged ribosomes are 
expressed. 

Potential founder organisms for a line of transformed organisms can be screened for
expression of the molecularly tagged ribosomal protein coding sequence by ribosomes in
the population of cells characterized by expression of the endogenous characterizing gene. 

Transformed organisms that exhibit appropriate expression (e.g., detectable 
expression having substantially the same expression pattern as the endogenous
characterizing gene in a corresponding non-transformed organism or anatomical region
thereof, i.e., detectable expression in at least 80%, 90% or, preferably, 95% of the cells 
shown to express the endogenous gene by in situ hybridization) are selected as lines of 
transformed organisms. 
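By way of illustration and not limitation, the selection criterion above (detectable expression in at least 80%, 90% or 95% of the cells shown to express the endogenous gene) amounts to a simple overlap fraction. The sketch below assumes that each scored cell has an identifier and that two sets of identifiers are available, one from in situ hybridization for the endogenous gene and one from detection of the molecular tag; the names `concordance` and `passes_screen` are hypothetical.

```python
def concordance(endogenous_positive, tag_positive) -> float:
    """Fraction of endogenous-gene-positive cells that also express the tag."""
    endogenous = set(endogenous_positive)
    if not endogenous:
        raise ValueError("no cells were scored positive for the endogenous gene")
    return len(endogenous & set(tag_positive)) / len(endogenous)


def passes_screen(endogenous_positive, tag_positive,
                  threshold: float = 0.80) -> bool:
    """True if the overlap fraction meets the chosen threshold (0.80, 0.90, 0.95)."""
    return concordance(endogenous_positive, tag_positive) >= threshold


if __name__ == "__main__":
    # Hypothetical cell identifiers from one anatomical region.
    endogenous_cells = {"c1", "c2", "c3", "c4", "c5"}
    tagged_cells = {"c1", "c2", "c3", "c4", "c9"}
    print(concordance(endogenous_cells, tagged_cells))
    print(passes_screen(endogenous_cells, tagged_cells, threshold=0.95))
```

Note that this fraction measures only coverage of the endogenous pattern; cells that express the tag but not the endogenous gene (such as `c9` in the example) would need to be assessed separately.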

In a preferred embodiment, immunohistochemistry using an antibody specific for the
molecular tag or a marker activated or repressed thereby is used to detect expression of the
molecular tag. 

5.14. PROFILING OF mRNA SPECIES 

Once isolated, the mRNA bound by the tagged ribosomal proteins or mRNA binding 
proteins of the invention can be analyzed by any method known in the art. In one aspect of
the invention, the gene expression profile of cells expressing the tagged ribosomal proteins 
or mRNA binding proteins is analyzed using any number of methods known in the art, for 
example but not by way of limitation, by isolating the mRNA and constructing cDNA 
libraries or by labeling the RNA for gene expression analysis. 

In a preferred embodiment, poly-A+ RNA (mRNA) is isolated from the tagged
ribosomal proteins or mRNA binding proteins of the invention, and converted to cDNA
through a reverse transcription reaction primed by a first primer that comprises an oligo-dT
sequence. The first primer is contacted with the poly-A+ RNA under conditions that allow
the oligo-dT site to hybridize to the first selected sequence (i.e., the poly-A sequence).
Alternatively, the first primer comprises a sequence that is the reverse complement of a
specific selected sequence (for example, a sequence characteristic of a family of mRNAs). 
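By way of illustration and not limitation, the relationship between a selected mRNA sequence and a first primer that is its reverse complement can be computed directly. The sketch below assumes the selected sequence is given 5' to 3' as RNA (A, C, G, U) and returns a DNA primer 5' to 3'; the function name `first_primer_for` is hypothetical, and practical primer design (length, melting temperature, anchoring) is not addressed.

```python
# RNA base -> complementary DNA base
_RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_primer_for(selected_rna: str) -> str:
    """Return the DNA reverse complement (5'->3') of an RNA sequence (5'->3')."""
    rna = selected_rna.upper()
    unknown = set(rna) - set(_RNA_TO_DNA_COMPLEMENT)
    if unknown:
        raise ValueError(f"unexpected bases in RNA sequence: {sorted(unknown)}")
    # Reverse the sequence, then complement each base.
    return "".join(_RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna))


if __name__ == "__main__":
    # A poly-A tract yields an oligo-dT primer.
    print(first_primer_for("A" * 18))
    # A hypothetical family-specific motif.
    print(first_primer_for("AUGGCC"))
```

The first call shows why an oligo-dT primer hybridizes to the poly-A sequence: oligo-dT is simply the reverse complement of a poly-A tract.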

The first primer is then used to prime synthesis of a first-strand cDNA by reverse 
transcription of the source single-stranded nucleic acid. When the source nucleic acid is
mRNA, an RNA-dependent DNA polymerase activity is required to convert the
primer-source mRNA hybrid to a first-strand cDNA-source mRNA hybrid. A reverse
transcriptase can be used to catalyze RNA-dependent DNA polymerase activity. 

Reverse transcriptase is found in all retroviruses and is commonly derived from 
Moloney murine leukemia virus (M-MLV-RT), avian myeloblastosis virus (AMV-RT),
bovine leukemia virus (BLV-RT), Rous sarcoma virus (RSV-RT), and human
immunodeficiency virus (HIV-RT); enzymes from these sources are commercially available
(e.g., Life Technologies-Gibco BRL, Rockville, MD; Roche Molecular Biochemicals,
Indianapolis, IN; PanVera, Madison, WI).

A single reverse transcriptase or a combination of two or more reverse transcriptases
(e.g., M-MLV-RT and AMV-RT) can be used to catalyze reverse transcription and
first-strand cDNA synthesis. Such reverse transcriptases are used to convert a
primer-single-stranded nucleic acid (mRNA) hybrid to a first-strand cDNA-primer-single-stranded nucleic
acid hybrid in the presence of additional reagents that include, but are not limited to:
dNTPs; monovalent and divalent cations, e.g., KCl, MgCl2; sulfhydryl reagents, e.g.,
dithiothreitol (DTT); and buffering agents, e.g., Tris-Cl.

As described below (second-strand cDNA synthesis), the catalytic activities required 
to convert a first-strand cDNA-single-stranded nucleic acid hybrid to ds cDNA are an 
RNase H activity and a DNA-dependent DNA polymerase activity. Most reverse 
transcriptases, such as the ones described above (i.e., M-MLV-RT, AMV-RT, BLV-RT,
RSV-RT, and HIV-RT) also catalyze each of these activities. Therefore, in certain
embodiments, the reverse transcriptase employed for first-strand cDNA synthesis remains in
the reaction mixture where it can also serve to catalyze second-strand cDNA synthesis. 
Alternatively, a variety of proteins that catalyze one or two of these activities can be added 
to the cDNA synthesis reaction. Such proteins may be added together during a single
reaction step, or added sequentially during two or more substeps.

Preferably a reverse transcriptase lacking RNase H activity is used, in particular 
when long transcripts are desired. For example, M-MLV reverse transcriptase lacking 
RNase H activity (Kotewicz et al., U.S. Patent No. 5,405,776, issued April 11, 1995;
commercially available as SUPERSCRIPT II™ (Life Technologies - Gibco BRL)) can be used
to catalyze both RNA-dependent DNA polymerase activity and DNA-dependent DNA
polymerase activity. In a preferred embodiment, SUPERSCRIPT II™ (Life Technologies -
Gibco BRL) is used as a source of DNA polymerase activity. This DNA polymerase can be
used to synthesize a complementary DNA strand from single-stranded RNA, DNA, or an
RNA:DNA hybrid. SUPERSCRIPT II™ is genetically engineered by the introduction of point
mutations that greatly reduce its RNase H activity but preserve full DNA polymerase
activity. The structural modification of the enzyme therefore eliminates almost all 
degradation of RNA molecules during first-strand cDNA synthesis. 

In certain embodiments, the reverse transcriptase is inactivated after first-strand 
synthesis. The reverse transcriptase may be rendered inactive using any convenient
protocol. The transcriptase may be irreversibly or reversibly rendered inactive. Where the
transcriptase is reversibly rendered inactive, the transcriptase is physically or chemically 
altered so as to no longer be able to catalyze RNA-dependent DNA polymerase activity. 

The reverse transcriptase may be irreversibly inactivated by any convenient means. 
In certain embodiments, the reverse transcriptase is heat inactivated. The reaction mixture
is subjected to heating to a temperature sufficient to inactivate the reverse transcriptase prior
to commencement of the transcription step. In these embodiments, the temperature of the
reaction mixture, and therefore the reverse transcriptase present therein, is typically raised to
55°C to 70°C for 5 to 60 minutes, preferably to about 65°C for 15 minutes. In a preferred
embodiment, the transcriptase is inactivated by adding 1 M KOH to the reaction mixture,
preferably to make a final concentration of 50 mM KOH in the reaction mixture, and by
incubating at 65 °C for 15 min prior to commencement of the transcription step. This step 
ensures that contaminating non-poly-A RNA is removed from the sample, making the 
subsequent tailing reaction more efficient. 
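By way of illustration and not limitation, the volume of 1 M KOH stock needed to reach a final concentration of 50 mM follows from conservation of solute: if a volume v of stock at concentration Cs is added to a reaction of volume V, the final concentration is Cs·v/(V + v), so v = V·Cf/(Cs − Cf). The sketch below assumes that the reaction mixture contains no KOH beforehand and that volumes are additive; the function name `stock_volume_to_add` and the 19 µl example volume are hypothetical.

```python
def stock_volume_to_add(reaction_volume_ul: float,
                        stock_mm: float = 1000.0,
                        final_mm: float = 50.0) -> float:
    """Volume of stock (in µl) to add so the mixture reaches final_mm.

    Solves stock_mm * v / (V + v) = final_mm for v, assuming the reaction
    initially contains none of the solute and volumes are additive.
    """
    if reaction_volume_ul <= 0:
        raise ValueError("reaction volume must be positive")
    if not 0 < final_mm < stock_mm:
        raise ValueError("final concentration must be between 0 and the stock")
    return reaction_volume_ul * final_mm / (stock_mm - final_mm)


if __name__ == "__main__":
    v = stock_volume_to_add(19.0)  # hypothetical 19 µl reaction
    print(f"add {v:.2f} µl of 1 M KOH to 19 µl for 50 mM final")
```

For these default concentrations the ratio is 1 part stock to 19 parts reaction mixture, i.e. a 20-fold dilution of the stock.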

Alternatively, reverse transcriptase may be irreversibly inactivated by introducing a
reagent into the reaction mixture that chemically alters the protein so that it no longer has
RNA-dependent DNA polymerase activity. 

In a preferred embodiment, the reverse transcription reaction to synthesize the first- 
strand cDNA proceeds at 42 °C for 30-40 min using SUPERSCRIPT II™ as the source of 
reverse transcriptase / DNA polymerase. 

The transcribed first-strand cDNA may be isolated from the source RNA to which it
is hybridized by any of a wide variety of established methods. For example, the isolation
method may involve treating the RNA with a nuclease such as RNase H, a denaturant such 
as heat or an alkali, etc., and/or separating the strands by electrophoresis. The second strand 
of cDNA can be synthesized using methods well known in the art, for example using
reverse transcriptase which primes from the hairpin loop structure that forms at the 3' end of
the first strand of cDNA. 

Gene expression in cells treated and not treated with a compound of interest or in 
cells from animals treated or untreated with a particular treatment, e.g., pharmaceutical or
surgical treatment, may be compared. In addition, mRNA bound by the tagged ribosomal
proteins or mRNA binding proteins may also be analyzed, for example by northern blot 
analysis, PCR, RNase protection, etc., for the presence of mRNAs encoding certain protein 
products and for changes in the presence or levels of these mRNAs depending on the 
treatment of the cells. In specific embodiments, the mRNA is isolated from different
populations of cells or from populations of cells exposed to different stimuli.

In another aspect, mRNA bound by the tagged ribosomal proteins or mRNA binding
proteins may be used to produce a cDNA library and, in fact, a collection of such cell-type
specific cDNA libraries may be generated from different populations of isolated cells. Such
cDNA libraries are useful to analyze gene expression, isolate and identify cell-type specific
genes, splice variants and non-coding RNAs. In another aspect, such cell-type specific
libraries prepared from mRNA bound by, and isolated from, the tagged ribosomal proteins
or mRNA binding proteins from treated and untreated transgenic animals of the invention or
from transgenic animals of the invention having and not having a disease state can be used,
for example in subtractive hybridization procedures, to identify genes expressed at higher or
lower levels in response to a particular treatment or in a disease state as compared to
untreated transgenic animals. The mRNA isolated from the tagged ribosomal proteins or
mRNA binding proteins may also be analyzed using particular microarrays generated and
analyzed by methods well known in the art. Gene expression analysis using microarray
technology is well known in the art. Methods for making microarrays are taught, for
example, in United States Patent No. 5,700,637 by Southern, United States Patent No.
5,510,270 by Fodor et al. and PCT publication WO 99/35293 by Albrecht et al., which are
incorporated by reference in their entireties. By probing a microarray with various
populations of mRNAs, transcribed genes in certain cell populations can be identified.
Moreover, the pattern of gene expression in different cell types or cell states may be readily
compared.

Data from such analyses may be used to generate a database of gene expression 
analysis for different populations of cells in the animal or in particular tissues or anatomical 
regions, for example, in the brain. Using such a database together with bioinformatics tools, 
such as hierarchical and non-hierarchical clustering analysis and principal components
analysis, cells are "fingerprinted" for particular indications from healthy and disease-model
animals or tissues. 
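As one illustration of hierarchical clustering applied to such a database, the following Python sketch groups expression profiles by average-linkage agglomerative clustering on a correlation distance. It is a minimal sketch rather than a prescribed method: the function names, the sample labels, and the expression values are all hypothetical, and the choice of average linkage and Pearson correlation is an assumption.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)  # undefined for constant profiles


def average_linkage_clusters(profiles, k):
    """Merge the two closest clusters repeatedly until k clusters remain."""
    clusters = [frozenset([name]) for name in profiles]

    def distance(c1, c2):
        # Mean of (1 - correlation) over all cross-cluster sample pairs.
        pairs = [(a, b) for a in c1 for b in c2]
        total = sum(1 - pearson(profiles[a], profiles[b]) for a, b in pairs)
        return total / len(pairs)

    while len(clusters) > k:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: distance(*pair))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return clusters


# Hypothetical expression values for four genes in four samples.
profiles = {
    "type_A_rep1": [10, 200, 15, 300],
    "type_A_rep2": [12, 180, 20, 310],
    "type_B_rep1": [250, 8, 190, 12],
    "type_B_rep2": [240, 10, 210, 9],
}

clusters = average_linkage_clusters(profiles, 2)
for cluster in clusters:
    print(sorted(cluster))
```

Samples whose profiles rise and fall together across genes are merged first, so replicate samples of one cell type are expected to fall into the same cluster.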

In yet another embodiment, specific cells or cell populations that express a
molecularly tagged ribosomal protein or mRNA binding protein are isolated from the
collection and analyzed for specific protein-protein interactions or an entire protein profile
using proteomics methods known in the art, for example, chromatography, mass
spectroscopy, 2D gel analysis, etc.

Other types of assays may be used to analyze the cell population expressing the 
molecularly tagged ribosomal protein or mRNA binding protein, either in vivo, in explanted
or sectioned tissue or in the isolated cells, for example, to monitor the response of the cells
to a certain treatment or candidate compound or to compare the response of the animals, 
tissue or cells to expression of the target or inhibitor thereof, with animals, tissue or cells 
from animals not expressing the target or inhibitor thereof. The cells may be monitored, for 
example, but not by way of limitation, for changes in electrophysiology, physiology (for
example, changes in physiological parameters of cells, such as intracellular or extracellular
calcium or other ion concentration, change in pH, change in the presence or amount of 
second messengers, cell morphology, cell viability, indicators of apoptosis, secretion of 
secreted factors, cell replication, contact inhibition, etc.), morphology, etc. 

In particular embodiments, the isolated mRNA is used to probe a comprehensive
expression library (see, e.g., Serafini et al., United States Patent No. 6,110,711, issued
August 29, 2000, which is incorporated by reference herein). The library may be
normalized and presented in a high density array. Because approximately one tenth of the
mRNA species in a typical somatic cell constitute 50% to 65% of the mRNA present, the
cDNA library may be normalized using reassociation-kinetics based methods. (See Soares,
1997, Curr. Opin. Biotechnol. 8:542-546).

In a particular embodiment, a subpopulation of cells expressing a molecularly tagged 
ribosomal protein or mRNA binding protein is identified and/or gene expression analyzed 
using the methods of Serafini et al., WO 99/29877 entitled "Methods for defining cell
types," which is hereby incorporated by reference in its entirety. 


6. EXAMPLE 1: TAGGING OF RIBOSOMAL PROTEINS 

6.1. ISOLATION OF RIBOSOMAL PROTEIN-ENCODING cDNAs 

This example demonstrates the successful introduction of a Strep-tag into ribosomal
subunit protein-encoding cDNAs.

Oligonucleotides complementary to the sequences of the ribosomal subunit proteins S6,
L37, S20, and L32 were designed to permit PCR amplification of the cDNAs from reverse
transcribed mRNA. EcoRI and NotI restriction sites were incorporated into the 5' terminal
ends of the 5' and 3' specific oligonucleotides to facilitate the subcloning of the amplified
cDNAs into the expression vector pCDNA3.1+. The sequences of the oligonucleotide sets
were as follows:

S6 

5' oligo. GGAATTCATTCAAGATGAAGCTGAACATCTCCTTCCC (SEQ ID NO: 7) 
3' oligo. GCGGCCGCTTTTCTGACTGGATTCAGACTTAGAAGTAGAAGCT (SEQ ID NO: 8)

L37 

5' oligo. GGAATTCCCGGCGACATGGCTAAACGCACCAAGAAGG (SEQ ID NO: 9) 
3' oligo. GCGGCCGCTCTGGTCTTTCAGTTCCTTCAGTCTTCTGAT (SEQ ID NO: 10) 


S20 

5' oligo. GGAATTCGCGCGCAACAGCCATGGCTTTTAAGGATAC (SEQ ID NO: 11) 
3' oligo. GCGGCCGCTAGCATCTGCAATGGTGACTTCCACCTCAAC (SEQ ID NO: 12)


L32 

5' oligo. GGAATTCGGCATCATGGCTGCCCTTCGGCCTCTGGTG (SEQ ID NO: 13) 
3' oligo. GCGGCCGCTTTCATTCTCTTCGCTGCGTAGCCTGGC (SEQ ID NO: 14) 
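The placement of the restriction sites in the listed oligonucleotides can be checked with a short script. The sketch below copies the eight primer sequences from the list above and reports where the EcoRI (GAATTC) and NotI (GCGGCCGC) recognition sequences occur; the names `PRIMERS` and `site_report` are illustrative and are not part of the disclosure.

```python
ECORI = "GAATTC"    # EcoRI recognition sequence
NOTI = "GCGGCCGC"   # NotI recognition sequence

# (5' oligo, 3' oligo) for each ribosomal protein, SEQ ID NOS: 7-14.
PRIMERS = {
    "S6": ("GGAATTCATTCAAGATGAAGCTGAACATCTCCTTCCC",
           "GCGGCCGCTTTTCTGACTGGATTCAGACTTAGAAGTAGAAGCT"),
    "L37": ("GGAATTCCCGGCGACATGGCTAAACGCACCAAGAAGG",
            "GCGGCCGCTCTGGTCTTTCAGTTCCTTCAGTCTTCTGAT"),
    "S20": ("GGAATTCGCGCGCAACAGCCATGGCTTTTAAGGATAC",
            "GCGGCCGCTAGCATCTGCAATGGTGACTTCCACCTCAAC"),
    "L32": ("GGAATTCGGCATCATGGCTGCCCTTCGGCCTCTGGTG",
            "GCGGCCGCTTTCATTCTCTTCGCTGCGTAGCCTGGC"),
}


def site_report(primers):
    """Map each protein to (EcoRI offset in 5' oligo, NotI offset in 3' oligo).

    str.find returns -1 when the site is absent, so a negative offset
    flags a primer that lacks the expected site.
    """
    return {
        name: (forward.find(ECORI), reverse.find(NOTI))
        for name, (forward, reverse) in primers.items()
    }


report = site_report(PRIMERS)
for name, (ecori_at, noti_at) in report.items():
    print(f"{name}: EcoRI at offset {ecori_at}, NotI at offset {noti_at}")
```

Each 5' oligonucleotide also contains an ATG start codon downstream of the EcoRI site, which can be confirmed in the same way with `forward.find("ATG")`.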

Mouse brain cDNA (Clontech) was used as the template for a polymerase chain
reaction (PCR). 50 µL PCR aliquots were prepared for each set of primer pairs. Each
reaction consisted of 40 µL PCR-grade water, 5 µL 10X Advantage 2 PCR Buffer
(Clontech), 1 µL mouse brain cDNA template, 1 µL each 5' and 3' oligonucleotide primer
(10 µM), 1 µL dNTP mix (10 mM each dATP, dCTP, dTTP, and dGTP), and 1 µL 50X
Advantage 2 Polymerase Mix.
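The reaction composition can be checked by simple dilution arithmetic. The sketch below assumes that the listed volumes are in microliters and that the primer stock is 10 µM (both are assumptions based on the usual scale of such reactions); the helper name `final_concentration` is illustrative.

```python
# Component volumes in microliters (assumed unit) for one reaction.
VOLUMES_UL = {
    "PCR-grade water": 40.0,
    "10X Advantage 2 PCR Buffer": 5.0,
    "mouse brain cDNA template": 1.0,
    "5' primer": 1.0,
    "3' primer": 1.0,
    "dNTP mix": 1.0,
    "50X Advantage 2 Polymerase Mix": 1.0,
}


def final_concentration(stock, volume_ul, total_ul):
    """Dilution arithmetic: C_final = C_stock * V_added / V_total."""
    return stock * volume_ul / total_ul


total_ul = sum(VOLUMES_UL.values())
buffer_x = final_concentration(10.0, 5.0, total_ul)      # 10X stock
polymerase_x = final_concentration(50.0, 1.0, total_ul)  # 50X stock
dntp_mm = final_concentration(10.0, 1.0, total_ul)       # 10 mM each dNTP
primer_um = final_concentration(10.0, 1.0, total_ul)     # assumed 10 uM stock

print(f"total volume: {total_ul:.0f} uL")
print(f"buffer: {buffer_x:.1f}X, polymerase mix: {polymerase_x:.1f}X")
print(f"each dNTP: {dntp_mm:.2f} mM, each primer: {primer_um:.2f} uM")
```

Under these assumptions the components sum to the stated reaction volume, and the buffer and polymerase mix are both diluted to 1X.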

The PCR reaction was carried out under the following conditions: 

1. 95 °C for 1 minute

2. 30 cycles of 95 °C for 15 seconds and 68 °C for 1 minute 
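The programmed hold time of this two-step cycling profile follows directly from the listed steps. The sketch below adds up the hold times only; it ignores temperature ramping, which depends on the instrument, and the function name is illustrative.

```python
def program_hold_seconds(initial_s, cycles, step_seconds):
    """Initial hold plus (number of cycles) x (sum of the per-cycle holds)."""
    return initial_s + cycles * sum(step_seconds)


# 95 C for 1 minute, then 30 cycles of 95 C for 15 s and 68 C for 1 minute.
total_s = program_hold_seconds(60, 30, (15, 60))
print(f"{total_s} s = {total_s / 60:.1f} min")
```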

A 10 µL aliquot of each reaction was analyzed by electrophoresis through a 1.2% agarose gel
in TAE. The remainder of the reaction was purified using a QIAGEN QUICKSPIN PCR
reaction purification kit following the manufacturer's protocol.

Purified DNA was digested with EcoRI and NotI followed by electrophoresis
through a 1.2% agarose gel, isolation of the DNA fragment, and extraction of the DNA
from the gel using a QIAGEN QUICKSPIN Gel isolation kit following the manufacturer's
protocol.

Each cDNA fragment was ligated to pCDNA3.1+, which had been digested with
EcoRI and NotI. Ligated DNA was used to transform chemically competent DH5α bacteria.
Transformed bacteria were plated onto LB plates containing 100 µg/mL ampicillin. For
each ligation, 3 ampicillin resistant colonies were picked and grown in 5 mL LB cultures
containing 100 µg/mL ampicillin.

The cultures were incubated for 16 hours on a shaking platform at 37 °C. Plasmid
DNA was isolated from the cultures using a QIAGEN miniprep kit following the
manufacturer's protocol. Plasmid DNA was digested with PmeI and analyzed on a 1.2%
agarose gel to identify plasmids that contain the cDNA insert.

6.2. ADDITION OF STREP-TAG TO THE RIBOSOMAL SUBUNIT 
PROTEINS 

The amino acid sequence Trp-Ser-His-Pro-Gln-Phe-Glu-Lys (SEQ ID NO: 17)
represents Strep-tag II, a peptide that is able to bind with high affinity to the protein
Streptavidin. Proteins that contain the Strep-tag II can be identified and isolated through
affinity to Streptavidin. Strep-tag II was added to each of the ribosomal subunit proteins, S6,
S20, L32, and L37, at the C-terminus of the protein. Two complementary oligonucleotide
adaptors were designed that encode Strep-tag II. These complementary oligonucleotide
adaptors, when hybridized to form a double stranded DNA, can be ligated in-frame to the
ribosomal subunit cDNAs in the vector pCDNA3.1+.

The sequences of the Strep-tag II oligonucleotides were: 

Upper strand oligonucleotide

5' GGCCGCAGCGCTTGGAGCCACCCGCAGTTCGAAAAATAA 3' (SEQ ID NO: 15) 

Bottom strand oligonucleotide 

5' TCGATTATTTTTCGAACTGCGGGTGGCTCCAAGCGCTGC 3' (SEQ ID NO: 16) 
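The design of the two adaptor oligonucleotides can be verified computationally. The sketch below checks that the strands anneal over their internal 35 nucleotides, leaving 5' GGCC and 5' TCGA overhangs (compatible with NotI-cut and XhoI-cut ends, respectively), and translates the upper strand. The reading frame used, starting at the seventh nucleotide of the upper strand, is an assumption chosen because it yields the Strep-tag II sequence; the function names are illustrative.

```python
UPPER = "GGCCGCAGCGCTTGGAGCCACCCGCAGTTCGAAAAATAA"   # SEQ ID NO: 15
BOTTOM = "TCGATTATTTTTCGAACTGCGGGTGGCTCCAAGCGCTGC"  # SEQ ID NO: 16

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons of seq into one-letter amino acid codes."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


overhang_upper = UPPER[:4]    # single-stranded 5' end of the upper strand
overhang_bottom = BOTTOM[:4]  # single-stranded 5' end of the bottom strand
duplex_ok = UPPER[4:] == reverse_complement(BOTTOM[4:])
peptide = translate(UPPER[6:])  # assumed reading frame

print(overhang_upper, overhang_bottom, duplex_ok)
print(peptide)
```

In this frame the adaptor encodes Ser-Ala, then the eight residues of Strep-tag II, then a stop codon.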

Each of the plasmids containing the ribosomal subunit protein-encoding cDNAs was
digested with NotI and XhoI. The upper strand and bottom strand oligonucleotides were
mixed in equal molar ratios, heated to 70 °C, and allowed to cool to room temperature. The
hybridized oligonucleotides were then ligated to the NotI and XhoI digested plasmids. The
ligation reactions were transformed into competent DH5α bacteria and plated onto LB plates
supplemented with 100 µg/mL ampicillin.

For each ligation, five ampicillin resistant colonies were picked into 5 mL LB
cultures containing 100 µg/mL ampicillin. The cultures were grown at 37 °C for 16 hours.
Plasmid DNA was harvested, cut with PmeI, and analyzed by electrophoresis through 5%
non-denaturing polyacrylamide gels. Untagged ribosomal subunit protein-encoding cDNAs
were also digested with PmeI, and run side by side with the tagged versions to identify the
cDNAs that contained the Strep-tag II sequence. All tagged cDNAs were then sequenced to
confirm the sequence of each cDNA.

7. EXAMPLE 2: ISOLATION AND IMMUNOPRECIPITATION OF POLYSOMES

7.1. POLYSOME ISOLATION 

Plasmid constructs expressing tagged ribosomal proteins were transfected into
Human Embryonic Kidney (HEK293) cells using the transfection reagent FuGENE 6
(Roche Applied Science) following the manufacturer's procedures. Briefly, for each
transfection, 100 µL of serum free medium (DMEM) was placed into a sterile tube,
followed by the addition of 3 µL of FuGENE 6 and 1 µg of plasmid DNA. The FuGENE
6/DNA mixture was allowed to incubate at room temperature for 15 minutes before being
added to a 60 mm plate of HEK293 cells grown in DMEM supplemented with 10% fetal
calf serum, glutamine, and antibiotics.

Three days after transfection, the cells were harvested by scraping into
homogenization buffer (50 mM sucrose, 200 mM ammonium chloride, 7 mM magnesium
acetate, 1 mM dithiothreitol, and 20 mM Tris-HCl, pH 7.6). The cells were lysed by the
addition of the detergent NP-40 to a concentration of 0.5%, followed by five strokes in a
glass Dounce tissue homogenizer. Unlysed cells, nuclei and mitochondria were pelleted by
centrifugation at 10,000 x g for 10 minutes at 4 °C. The supernatant was carefully removed
and layered over a two-step discontinuous gradient of 1.8 M and 1.0 M sucrose in 100 mM
ammonium chloride, 5 mM magnesium acetate, 1 mM dithiothreitol, 20 mM Tris-HCl (pH
7.6). The gradient was centrifuged for 18 hours at 98,000 x g at 4 °C.

Following centrifugation, the supernatants were carefully removed, and the
polysome pellet was resuspended in 100 mM ammonium chloride, 5 mM magnesium
chloride, 1 mM DTT and 20 mM Tris-HCl (pH 7.6).

An equal volume of 2X denaturing protein electrophoresis sample buffer was added
to each of the polysome samples. Solubilized polysomal proteins were fractionated by
electrophoresis through an SDS-containing 4-20% gradient polyacrylamide gel, and
transferred to a nitrocellulose filter. The filter was quenched for 1 hour in PBS containing
5% dry milk followed by incubation with rabbit antisera specific for the Strep-tag II amino
acid sequence epitope Trp-Ser-His-Pro-Gln-Phe-Glu-Lys (SEQ ID NO: 17). The filters were
rinsed three times in PBS for 20 minutes each, followed by a one hour incubation with a
goat anti-rabbit antiserum that had been conjugated to horseradish peroxidase (HRP), in PBS
containing 10% dry milk. The filters were then washed three times in PBS. The HRP
was detected by incubating the filter in 20 mL of PBS containing 4-chloronaphthol and
hydrogen peroxide.

As seen in FIG. 1, polysomes from cells transfected with plasmids expressing tagged
versions of ribosomal proteins S6 (lane 2, in duplicate), L32 (lane 4, in duplicate, not easily
seen in the reproduction), and L37 (lane 5, in duplicate) contain proteins that are reactive to
the anti-Strep-tag II antibodies. These proteins correspond to the predicted molecular weights
of S6 (34 kDa), L32 (52 kDa), and L37 (9 kDa). The S6 and L37 proteins appear to be
more abundantly represented in the polysomal fraction compared to the L32 protein, which
is difficult to visualize in the figure but is present upon close inspection of the original filter.
Tagged S20 (lane 3, in duplicate) does not appear to be present in the polysomal fraction.
Polysomes from untransfected cells (lane 1, in duplicate) do not display any
immunoreactive material.


7.2. POLYSOME IMMUNOPRECIPITATION 

HEK293 cells were transfected with plasmid constructs expressing tagged
ribosomal proteins S6 or L37 and homogenized as above. Unlysed cells, nuclei, and
mitochondria were removed by centrifugation at 10,000 x g for 10 minutes. Five micrograms
of an anti-Strep-tag rabbit polyclonal antiserum was added to the supernatant and incubated at
4 °C for 72 hours. One hundred microliters of a protein A sepharose slurry was then added and
incubation continued at 4 °C for one hour. The sepharose beads were pelleted by
centrifugation at 1,000 x g for 5 minutes. The supernatant was removed, and the pellet was
resuspended in 10 mL of fresh homogenization buffer. This procedure was repeated three
times.

RNA was harvested from the protein A sepharose pellet using an RNA isolation kit
(Ambion). Briefly, the pellets were solubilized in 600 microliters of homogenization buffer,
followed by the addition of 600 microliters of 64% EtOH. This mixture was applied to the
spin column provided by the kit, followed by centrifugation at 10,000 x g for 1 minute. The
column was sequentially washed in the two wash buffers provided with the kit. RNA bound
to the column was released by the addition of elution buffer heated to 95 °C. RNA was
visualized by electrophoresis through an ethidium bromide containing agarose gel.

As seen in FIG. 2, ribosomal RNA is present (arrow) in material 
immunoprecipitated from tagged S6 (lane 2) transfectants. Such RNA is also present at low
levels in material from tagged L37 transfectants (lane 3; not easily seen in reproduction).
Such RNA is not present in material from untransfected cells (lane 1). 

All references cited herein are incorporated herein by reference in their entireties and 
for all purposes to the same extent as if each individual publication, patent or patent
application was specifically and individually indicated to be incorporated by reference in its
entirety for all purposes. 

The citation of any publication is for its disclosure prior to the filing date and should 
not be construed as an admission that the present invention is not entitled to antedate such 
publication by virtue of prior invention. 

Many modifications and variations of this invention can be made without departing
from its spirit and scope, as will be apparent to those skilled in the art. The specific
embodiments described herein are offered by way of example only, and the invention is to 
be limited only by the terms of the appended claims along with the full scope of equivalents 
to which such claims are entitled. 



WHAT IS CLAIMED IS:



1. A method of isolating mRNA from a population of cells, said method
comprising

(a) contacting a lysate or fraction of said population of cells with a reagent,
wherein said population of cells contains one or more cells having tagged
ribosomes comprising a ribosomal protein, or functional fragment thereof,
fused to a peptide tag, said peptide tag in said tagged ribosome binding
specifically to said reagent, and said peptide tag not being a ribosomal
protein or fragment thereof; and wherein said tagged ribosome is bound to
said mRNA;

(b) isolating said tagged ribosomes bound by said reagent; and

(c) isolating said mRNA from said tagged ribosomes.



2. The method of claim 1 wherein the peptide tag is streptavidin and the reagent
specifically binds streptavidin.

3. The method of claim 1 wherein the ribosomal protein is S6, L32, or L37. 

4. The method of claim 1 wherein the population of cells comprises two or
more cell types and wherein said tagged ribosomes are present only in one cell type in said
population.

5. The method of claim 1 wherein the ribosomal protein fused to a peptide tag
is encoded by a nucleic acid comprising a first nucleotide sequence encoding the ribosomal
protein, or functional fragment thereof, fused to a second nucleotide sequence encoding the
peptide tag, wherein the expression of the nucleic acid is regulated by a non-ribosomal
protein regulatory sequence.



6. A method of isolating mRNA from a population of cells, said method
comprising

(a) contacting a lysate or fraction of said population of cells with a reagent,
wherein said population of cells contains one or more cells having tagged
mRNA binding proteins comprising a mRNA binding protein, or functional
fragment thereof, fused to a peptide tag, said peptide tag on said tagged
mRNA binding protein binding specifically to said reagent, and said peptide
tag not being a mRNA binding protein; and wherein said tagged mRNA
binding protein is bound to said mRNA;

(b) isolating said tagged mRNA binding protein bound by said reagent; and

(c) isolating said mRNA from said tagged mRNA binding protein.

7. The method of claim 6 wherein the mRNA binding protein is not polyA 
binding protein. 

8. The method of claim 6 wherein the peptide tag is streptavidin and the reagent
specifically binds streptavidin.

9. The method of claim 6 wherein the population of cells comprises two or 
more cell types and wherein said tagged mRNA binding protein is present only in one cell
type in said population.

10. The method of claim 6 wherein the tagged mRNA binding protein is encoded 
by a nucleic acid comprising a first nucleotide sequence encoding the mRNA binding 
protein, or functional fragment thereof, fused to a second nucleotide sequence encoding the
peptide tag, wherein the expression of the nucleic acid is regulated by a non-mRNA binding
protein regulatory sequence. 

11. A non-human transgenic animal comprising a transgene comprising a
nucleotide sequence encoding a ribosomal protein fusion protein, wherein the ribosomal
protein fusion protein comprises a ribosomal protein, or functional fragment thereof, fused
to a peptide tag, which peptide tag is not a ribosomal protein or portion thereof; wherein
when said ribosomal protein fusion protein is present in a ribosome, said peptide tag is
bound by a reagent that specifically binds said peptide tag, and said ribosome containing
said ribosomal protein fusion protein binds mRNA; and wherein expression of said
nucleotide sequence is controlled by a regulatory sequence such that said nucleotide
sequence is expressed in a population of cells of said non-human transgenic animal.

12. The non-human transgenic animal of claim 11 wherein said reagent does not
specifically bind to any other component of said one or more cells.

13. The non-human transgenic animal of claim 11 wherein the transgenic animal
is a mouse.

14. The non-human transgenic animal of claim 11 wherein the transgenic animal
is a Drosophila.

15. The non-human transgenic animal of claim 11 wherein the peptide tag is
streptavidin and the reagent specifically binds streptavidin.

16. A non-human transgenic animal comprising a transgene comprising a
nucleotide sequence encoding a mRNA binding protein fusion protein, wherein the mRNA
binding protein fusion protein comprises a mRNA binding protein, or functional fragment
thereof, fused to a peptide tag, which peptide tag is not a mRNA binding protein or
fragment thereof; wherein when said mRNA binding protein fusion protein is present in a
ribosome, said peptide tag is bound by a reagent that specifically binds said peptide tag, and
said ribosome containing said mRNA binding protein fusion protein binds mRNA; and
wherein expression of said nucleotide sequence is controlled by a regulatory sequence such
that said nucleotide sequence is expressed in a population of cells of said non-human
transgenic animal.

17. The non-human transgenic animal of claim 16 wherein said mRNA binding 
protein is not polyA binding protein. 

18. The non-human transgenic animal of claim 16 wherein said reagent does not
specifically bind to any other component of said one or more cells.

19. The non-human transgenic animal of claim 16 wherein the transgenic animal 
is a mouse. 

20. The non-human transgenic animal of claim 16 wherein the transgenic animal
is a Drosophila.

21. The non-human transgenic animal of claim 16 wherein the peptide tag is
streptavidin and the reagent specifically binds streptavidin.

22. A transgenic plant comprising a transgene comprising a nucleotide sequence
encoding a ribosomal protein fusion protein, wherein the ribosomal protein fusion protein
comprises a ribosomal protein, or functional fragment thereof, fused to a peptide tag, which
peptide tag is not a ribosomal protein or portion thereof; wherein when said ribosomal
protein fusion protein is present in a ribosome, said peptide tag is bound by a reagent that
specifically binds said peptide tag, and said ribosome containing said ribosomal protein
fusion protein binds mRNA; and wherein expression of said nucleotide sequence is
controlled by a regulatory sequence such that said nucleotide sequence is expressed in a
population of cells of said transgenic plant.

23. A transgenic plant comprising a transgene comprising a nucleotide sequence
encoding a mRNA binding protein fusion protein, wherein the mRNA binding protein
fusion protein comprises a mRNA binding protein, or functional fragment thereof, fused to
a peptide tag, which peptide tag is not a mRNA binding protein or fragment thereof;
wherein when said mRNA binding protein fusion protein is present in a ribosome, said
peptide tag is bound by a reagent that specifically binds said peptide tag, and said ribosome
containing said mRNA binding protein fusion protein binds mRNA; and wherein expression
of said nucleotide sequence is controlled by a regulatory sequence such that said nucleotide
sequence is expressed in a population of cells of said transgenic plant.

24. A transgenic yeast cell comprising a transgene comprising a nucleotide
sequence encoding a ribosomal protein fusion protein, wherein the ribosomal protein fusion
protein comprises a ribosomal protein, or functional fragment thereof, fused to a peptide
tag, which peptide tag is not a ribosomal protein or portion thereof; wherein when said
ribosomal protein fusion protein is present in a ribosome, said peptide tag is bound by a
reagent that specifically binds said peptide tag, and said ribosome containing said ribosomal
protein fusion protein binds mRNA; and wherein expression of said nucleotide sequence is
controlled by a regulatory sequence such that said nucleotide sequence is expressed in said
transgenic yeast cell.

25. A transgenic yeast cell comprising a transgene comprising a nucleotide
sequence encoding a mRNA binding protein fusion protein, wherein the mRNA binding
protein fusion protein comprises a mRNA binding protein, or functional fragment thereof,
fused to a peptide tag, which peptide tag is not a mRNA binding protein or fragment
thereof; wherein when said mRNA binding protein fusion protein is present in a ribosome,
said peptide tag is bound by a reagent that specifically binds said peptide tag, and said
ribosome containing said mRNA binding protein fusion protein binds mRNA; and wherein
expression of said nucleotide sequence is controlled by a regulatory sequence such that said
nucleotide sequence is expressed in said transgenic yeast cell.

26. A cultured cell comprising a transgene comprising a nucleotide sequence
encoding a ribosomal protein fusion protein, wherein the ribosomal protein fusion protein
comprises a ribosomal protein, or functional fragment thereof, fused to a peptide tag, which
peptide tag is not a ribosomal protein or portion thereof; wherein when said ribosomal
protein fusion protein is present in a ribosome, said peptide tag is bound by a reagent that
specifically binds said peptide tag, and said ribosome containing said ribosomal protein
fusion protein binds mRNA; and wherein expression of said nucleotide sequence is
controlled by a regulatory sequence such that said nucleotide sequence is expressed in said
cultured cell.

27. The cultured cell of claim 26 which is a mammalian cell. 

28. A cultured cell comprising a transgene comprising a nucleotide sequence
encoding a mRNA binding protein fusion protein, wherein the mRNA binding protein
fusion protein comprises a mRNA binding protein, or functional fragment thereof, fused to
a peptide tag, which peptide tag is not a mRNA binding protein or fragment thereof;
wherein when said mRNA binding protein fusion protein is present in a ribosome, said
peptide tag is bound by a reagent that specifically binds said peptide tag, and said ribosome
containing said mRNA binding protein fusion protein binds mRNA; and wherein expression
of said nucleotide sequence is controlled by a regulatory sequence such that said nucleotide
sequence is expressed in said cultured cell.

29. The cultured cell of claim 28 which is a mammalian cell. 

30. A method of isolating mRNA from a population of cells from the transgenic
animal of claim 11, wherein one or more cells in said population express said ribosomal
protein fusion protein, said method comprising contacting a lysate or fraction of said
population of cells with a reagent which binds to said peptide tag; isolating the ribosomes
containing said peptide tag bound to said reagent from said lysate or fraction; and isolating
said mRNA from said ribosomes.

31. A method of isolating mRNA from a population of cells from the transgenic
animal of claim 16, wherein one or more cells in said population express said mRNA
binding protein fusion protein, said method comprising contacting a lysate or fraction of
said population of cells with a reagent which binds to said peptide tag; isolating the mRNA
binding protein fusion proteins bound to said reagent from said lysate or fraction; and
isolating said mRNA from said mRNA binding protein fusion protein.

32. A method of isolating mRNA from a population of cells from the transgenic
plant of claim 22, wherein one or more cells in said population express said ribosomal
protein fusion protein, said method comprising contacting a lysate or fraction of said
population of cells with a reagent which binds to said peptide tag; isolating the ribosomes
containing said ribosomal fusion protein bound to said reagent from said lysate or fraction;
and isolating said mRNA from said ribosomes.

33. A method of isolating mRNA from a population of cells from the transgenic
plant of claim 23, wherein one or more cells in said population express said mRNA binding
protein fusion protein, said method comprising contacting a lysate or fraction of said
population of cells with a reagent which binds to said peptide tag; isolating said mRNA
binding protein fusion protein bound to said reagent from said lysate or fraction; and
isolating said mRNA from said mRNA binding protein fusion protein.

34. A method of isolating mRNA from the transgenic yeast cell of claim 24, said
method comprising contacting a lysate or fraction of said transgenic yeast cell with a reagent
which binds to said peptide tag; isolating the ribosomes containing said ribosomal protein
fusion protein bound to said reagent from said lysate or fraction; and isolating said mRNA
from said ribosomes.

35. A method of isolating mRNA from the transgenic yeast cell of claim 25, said
method comprising contacting a lysate or fraction of said transgenic yeast cell with a reagent
which binds to said peptide tag; isolating said mRNA binding protein fusion protein bound
to said reagent from said lysate or fraction; and isolating said mRNA from said mRNA
binding protein fusion protein.

36. A method of isolating mRNA from the cultured cell of claim 26, said method
comprising contacting a lysate or fraction of said cultured cell with a reagent which binds to
said peptide tag; isolating the ribosomes containing said ribosomal fusion protein bound to
said reagent from said lysate or fraction; and isolating said mRNA from said ribosomes.

37. A method of isolating mRNA from the cultured cell of claim 28, said method
comprising contacting a lysate or fraction of said cultured cell with a reagent which binds to
said peptide tag; isolating said mRNA binding protein fusion protein bound to said reagent
from said lysate or fraction; and isolating said mRNA from said mRNA binding protein
fusion protein.

38. An isolated ribosome-reagent complex comprising

(1) a tagged ribosome, comprising a ribosomal protein fusion protein, said
ribosomal protein fusion protein comprising a ribosomal protein, or
functional fragment thereof, and a peptide tag, which peptide tag is not a
ribosomal protein or fragment thereof;

(2) a mRNA bound to said ribosomal protein fusion protein; and

(3) a reagent specifically bound to said peptide tag.

39. The isolated ribosome-reagent complex of claim 38 wherein said reagent is 
bound to a solid support. 


40. An isolated mRNA binding protein-mRNA-reagent complex comprising

(1) a tagged mRNA binding protein comprising a mRNA binding protein, or
functional fragment thereof, and a peptide tag, which peptide tag is not a
mRNA binding protein or fragment thereof;

(2) a mRNA bound to said tagged mRNA binding protein; and

(3) a reagent specifically bound to said peptide tag.

41. The isolated mRNA binding protein-mRNA-reagent complex of claim 40
wherein said reagent is bound to a solid support.

SEQUENCE LISTING

<110> Renovis, Inc. 

Heintz, Nathaniel 
Serafini, Tito
Shyjan, Andrew

<120> METHOD FOR ISOLATING CELL-TYPE SPECIFIC mRNAs

<130> 10239-024-228

<140>
<141>

<150> 60/340,689

<151> 2001-10-29

<160> 17

<170> PatentIn version 3.0

<210> 1 

<211> 9 

<212> PRT 

<213> Influenza virus 

<400> 1 

Tyr Pro Tyr Asp Val Pro Asp Tyr Ala

1 5

<210> 2 

<211> 10 

<212> PRT 

<213> Homo sapiens 

<400> 2 

Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu
1 5 10 

<210> 3 

<211> 6 

<212> PRT 

<213> Bluetongue virus 

<400> 3 

Gln Tyr Pro Ala Leu Thr

1 5 

<210> 4 

<211> 8 

<212> PRT 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: Flag peptide used for 
identification 

<400> 4 

Asp Tyr Lys Asp Asp Asp Asp Lys 

1 5 

<210> 5 

<211> 9 

<212> PRT 

<213> Artificial 
<220> 

<223> Description of Artificial Sequence: Strep-tag peptide used for 
identification 



<400> 5 

Ala Trp Arg His Pro Gln Phe Gly Gly 

1 5 

<210> 6 

<211> 551 

<212> DNA 

<213> Artificial 



<220> 

<223> Description of Artificial Sequence: Shuttle Vector 



<400> 6 

taacgttact ggccgaagcc gcttggaata aggccggtgt gcgtttgtct atatgttatt 60 
ttccaccata ttgccgtctt ttggcaatgt gagggcccgg aaacctggcc ctgtcttctt 120 
gacgagcatt cctaggggtc tttcccctct cgccaaagga atgcaaggtc tgttgaatgt 180 
cgtgaaggaa gcagttcctc tggaagcttc ttgaagacaa acaacgtctg tagcgaccct 240 
ttgcaggcag cggaaccccc cacctggcga caggtgcctc tgcggccaaa agccacgtgt 300 
ataagataca cctgcaaagg cggcacaacc ccagtgccac gttgtgagtt ggatagttgt 360 
ggaaagagtc aaatggctct cctaagcgta ttcaacaagg ggctgaagga tgcccagaag 420 
gtactccatt gtatgggatc tgatctgggg cctcggtgca catgctttac atgtgtttag 480 
tcgaggttaa aaaaacgtct aggccccccg aaccacgggg acgtggtttt cctttgaaaa 540 
acaccatgat a 551 



<210> 7 

<211> 37 

<212> DNA 

<213> Artificial 



<220> 

<223> Description of Artificial Sequence: PCR Primer 
<400> 7 

ggaattcatt caagatgaag ctgaacatct ccttccc 37 

<210> 8 

<211> 43 

<212> DNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: PCR Primer 
<400> 8 

gcggccgctt ttctgactgg attcagactt agaagtagaa gct 43 

<210> 9 

<211> 37 

<212> DNA 

<213> Artificial 



<220> 

<223> Description of Artificial Sequence: PCR Primer 
<400> 9 

ggaattcccg gcgacatggc taaacgcacc aagaagg 37 



<210> 10 
<211> 39 
<212> DNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: PCR Primer 
<400> 10 

gcggccgctc tggtctttca gttccttcag tcttctgat 

<210> 11 

<211> 37 

<212> DNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: PCR Primer 
<400> 11 

ggaattcgcg cgcaacagcc atggctttta aggatac 

<210> 12 

<211> 39 

<212> DNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: PCR Primer 
<400> 12 

gcggccgcta gcatctgcaa tggtgacttc cacctcaac 

<210> 13 

<211> 37 

<212> DNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: PCR Primer 
<400> 13 

ggaattcggc atcatggctg cccttcggcc tctggtg 

<210> 14 

<211> 36 

<212> DNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: PCR Primer 
<400> 14 

gcggccgctt tcattctctt cgctgcgtag cctggc 

<210> 15 

<211> 39 

<212> DNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: PCR Primer 



<400> 15 
ggccgcagcg cttggagcca cccgcagttc gaaaaataa 

<210> 16 

<211> 39 

<212> DNA 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: PCR Primer 
<400> 16 

tcgattattt ttcgaactgc gggtggctcc aagcgctgc 

<210> 17 

<211> 8 

<212> PRT 

<213> Artificial 

<220> 

<223> Description of Artificial Sequence: Tag used for identification 



<400> 17 

Trp Ser His Pro Gln Phe Glu Lys 
1 5 



